MODIFIED HIV ENV POLYPEPTIDES



Technical Field 

The invention relates generally to modified HIV envelope (Env) polypeptides which
are useful as immunizing agents or for generating an immune response in a subject, for
example a cellular immune response or a protective immune response. More particularly, the
invention relates to Env polypeptides such as gp120, gp140 or gp160, wherein at least one of
the native β-sheet configurations has been modified. The invention also pertains to methods
of using these polypeptides to elicit an immune response against a broad range of HIV
subtypes.

Background of the Invention 

The human immunodeficiency virus (HIV-1, also referred to as HTLV-III, LAV or
HTLV-III/LAV) is the etiological agent of the acquired immune deficiency syndrome (AIDS)
and related disorders (see, e.g., Barre-Sinoussi, et al., (1983) Science 220:868-871; Gallo et
al. (1984) Science 224:500-503; Levy et al., (1984) Science 225:840-842; Siegal et al., (1981)
N. Engl. J. Med. 305:1439-1444). AIDS patients usually have a long asymptomatic period
followed by the progressive degeneration of the immune system and the central nervous
system. Replication of the virus is highly regulated, and both latent and lytic infection of the
CD4 positive helper subset of T-lymphocytes occur in tissue culture (Zagury et al., (1986)
Science 231:850-853). Molecular studies of HIV-1 show that it encodes a number of genes
(Ratner et al., (1985) Nature 313:277-284; Sanchez-Pescador et al., (1985) Science 227:484-492),
including three structural genes (gag, pol and env) that are common to all
retroviruses. Nucleotide sequences from viral genomes of other retroviruses, particularly
HIV-2 and simian immunodeficiency viruses, SIV (previously referred to as STLV-III), also
contain these structural genes (Guyader et al., (1987) Nature 326:662-669; Chakrabarti et
al., (1987) Nature).

The envelope protein of HIV-1, HIV-2 and SIV is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in the
membrane bilayer of the virion, while the gp120 segment protrudes into the surrounding
environment. gp120 and gp41 are non-covalently associated, and free gp120 can be released
from the surface of virions and infected cells.

As depicted in Figure 1, crystallography studies of the gp120 core polypeptide
indicate that this polypeptide is folded into two major domains having certain emanating
structures. The inner domain (inner with respect to the N and C terminus) features a two-helix,
two-stranded bundle with a small five-stranded β-sandwich at its termini-proximal end
and a projection at the distal end from which the V1/V2 stem emanates. The outer domain is
a stacked double barrel that lies alongside the inner domain so that the outer barrel and inner
bundle axes are approximately parallel. Between the distal inner domain and the distal outer
domain is a four-stranded bridging sheet which holds a peculiar minidomain in contact with,
but distinct from, the inner domain, the outer domain, and the V1/V2 domain. The bridging
sheet is composed of four β-strand structures (β-3, β-2, β-21, β-20, shown in Figure 1). The
bridging region can be seen in Figure 1 packing primarily over the inner domain, although some
surface residues of the outer domain, such as Phe 382, reach into the bridging sheet to form
part of its hydrophobic core.

The basic unit of the β-sheet conformation of the bridging sheet region is the β-strand,
which exists as a less tightly coiled helix, with 2.0 residues per turn. The β-strand
conformation is only stable when incorporated into a β-sheet, where hydrogen bonds with
close to optimal geometry are formed between the peptide groups on adjacent β-strands; the
dipole moments of the strands are also aligned favorably. Side chains from adjacent residues
of the same strand protrude from opposite sides of the sheet and do not interact with each
other, but have significant interactions with their backbone and with the side chains of
neighboring strands. For a general description of β-sheets, see, e.g., T.E. Creighton, Proteins:
Structures and Molecular Properties (W.H. Freeman and Company, 1993); and A.L.
Lehninger, Biochemistry (Worth Publishers, Inc., 1975).

The gp120 polypeptide is instrumental in mediating entry into the host cell. Recent
studies have indicated that binding of CD4 to gp120 induces a conformational change in Env
that allows for binding to a co-receptor (e.g., a chemokine receptor) and subsequent entry of
the virus into the cell (Wyatt, R., et al. (1998) Nature 393:705-711; Kwong, P., et al. (1998)
Nature 393:648-659). Referring again to Figure 1, CD4 is bound into a depression formed at
the interface of the outer domain, the inner domain and the bridging sheet of gp120.

Immunogenicity of the gp120 polypeptide has also been studied. For example,
individuals infected by HIV-1 usually develop antibodies that can neutralize the virus in in
vitro assays, and this response is directed primarily against linear neutralizing determinants in
the third variable loop of gp120 glycoprotein (Javaherian, K., et al. (1989) Proc. Natl. Acad.
Sci. 86:6786-6772; Matsushita, M., et al. (1988) J. Virol. 62:2107-2144; Putney, S., et al.
(1986) Science 234:1392-1395; Rushe, J. R., et al. (1988) Proc. Natl. Acad. Sci. USA
85:3198-3202). However, these antibodies generally exhibit the ability to neutralize only a
limited number of HIV-1 strains (Matthews, T. (1986) Proc. Natl. Acad. Sci. USA 83:9709-9713;
Nara, P. L., et al. (1988) J. Virol. 62:2622-2628; Palker, T. J., et al. (1988) Proc. Natl.
Acad. Sci. USA 85:1932-1936). Later in the course of HIV infection in humans, antibodies
capable of neutralizing a wider range of HIV-1 isolates appear (Barre-Sinoussi, F., et al.
(1983) Science 220:868-871; Robert-Guroff, M., et al. (1985) Nature (London) 316:72-74;
Weis, R., et al. (1985) Nature (London) 316:69-72; Weis, R., et al. (1986) Nature (London)
324:572-575).

Recent work done by Stamatatos et al. (1998) AIDS Res Hum Retroviruses
14(13):1129-39, shows that a deletion of the variable region 2 from an HIV-1 SF162 virus, which
utilizes the CCR-5 co-receptor for virus entry, rendered the virus highly susceptible to
serum-mediated neutralization. This V2 deleted virus was also neutralized by sera obtained from
patients infected not only with clade B HIV-1 isolates but also with clade A, C, D and F
HIV-1 isolates. However, deletion of the variable region 1 had no effect. Deletion of the variable
regions 1 and 2 from a LAI isolate HIV-1 IIIB also increased the susceptibility to neutralization
by monoclonal antibodies whose epitopes are located within the V3 loop, the CD4-binding
site, and conserved gp120 regions (Wyatt, R., et al. (1995) J. Virol. 69:5723-5733). Rabbit
immunogenicity studies done with the HIV-1 virus with deletions in the V1/V2 and V3
region from the LAI strain, which uses the CXCR4 co-receptor for virus entry, showed no
improvement in the ability of Env to raise neutralizing antibodies (Leu et al. (1998) AIDS
Res. and Human Retroviruses 14:151-155).

Further, a subset of the broadly reactive antibodies, found in most infected
individuals, interferes with the binding of gp120 and CD4 (Kang, C.-Y., et al. (1991) Proc.
Natl. Acad. Sci. USA 88:6171-6175; McDougal, J. S., et al. (1986) J. Immunol. 137:2937-2944).
Other antibodies are believed to bind to the chemokine receptor binding region after
CD4 has bound to Env (Thali et al. (1993) J. Virol. 67:3978-3988). The fact that neutralizing
antibodies generated during the course of HIV infection do not provide permanent antiviral
effect may in part be due to the generation of "neutralization escape" virus mutants and to
the general decline in the host immune system associated with pathogenesis. In contrast, the
presence of pre-existing neutralizing antibodies upon initial HIV-1 exposure will likely have
a protective effect.

It is widely thought that a successful vaccine should be able to induce a strong,
broadly neutralizing antibody response against diverse HIV-1 strains (Montefiori and Evans
(1999) AIDS Res. Hum. Ret. 15(8):689-698; Bolognesi, D. P., et al. (1994) Ann. Int. Med.
8:603-611; Haynes, B. F., et al. (1996) Science 271:324-328). Neutralizing antibodies, by
attaching to the incoming virions, can reduce or even prevent their infectivity for target cells
and prevent the cell-to-cell spread of virus in tissue culture (Hu et al. (1992) Science 255:456-459;
Burton, D. R. and Montefiori, D. (1997) AIDS 11(suppl. A):587-598). However, as
described above, antibodies directed against gp120 do not generally exhibit broad antibody
responses against different HIV strains.

Currently, the focus of vaccine development, from the perspective of humoral
immunity, is on the neutralization of primary isolates that utilize the CCR5 chemokine
co-receptor believed to be important in virus entry (Zhu, T., et al. (1993) Science 261:1179-1181;
Fiore, J., et al. (1994) Virology 204:297-303). These viruses are generally much more
resistant to antibody neutralization than T-cell line adapted strains that use the CXCR4
co-receptor, although both can be neutralized in vitro by certain broadly and potently acting
monoclonal antibodies, such as IgG1b12, 2G12 and 2F5 (Trkola, A., et al. (1995) J. Virol.
69:6609-6617; D'Sousa PM., et al. (1997) J. Infect. Dis. 175:1062-1075). These monoclonal
antibodies are directed to the CD4 binding site, a glycosylation site and the gp41 fusion
domain, respectively. The problem that remains, however, is that it is not known how to
induce antibodies of the appropriate specificity by vaccination. Antibodies (Abs) elicited by
gp120 glycoprotein from a given isolate are usually only able to neutralize closely related
viruses, generally from a similar, and usually from the same, HIV-1 subtype.

Despite the above approaches, there remains a need for Env antigens that can elicit an
immunological response (e.g., neutralizing and/or protective antibodies) in a subject against
multiple HIV strains and subtypes, for example when administered as a vaccine. The present
invention solves these and other problems by providing modified Env polypeptides (e.g.,
gp120) to expose epitopes in or near the CD4 binding site.

Summary of the Invention

In accordance with the present invention, modified HIV Env polypeptides are
provided. In particular, deletions and/or mutations are made in one or more of the four
antiparallel β-strands of the bridging sheet in the HIV Env polypeptide. In this way, enough
structure is left to allow correct folding of the polypeptide, for example of gp120, yet enough
of the bridging sheet is removed to expose the CD4 groove, allowing an immune response to be
generated against epitopes in or near the CD4 binding site of the Env polypeptide (e.g., gp120).

In one aspect, the invention includes a polynucleotide encoding a modified HIV Env
polypeptide wherein the polypeptide has at least one amino acid residue modified (e.g.,
deleted or replaced) in the region corresponding to residues 421 to 436 relative to
HXB-2, for example the constructs depicted in Figures 6-29 (SEQ ID NOs:3 to 26). In
certain embodiments, the polynucleotide also has the region corresponding to residues 124-198
of the polypeptide HXB-2 (e.g., V1/V2) deleted and at least one amino acid deleted or
replaced in the regions corresponding to the residues 119 to 123 and 199 to 210, relative to
HXB-2. In other embodiments, these polynucleotides encode Env polypeptides having at
least one amino acid of the small loop of the bridging sheet (e.g., amino acid residues 427 to
429 relative to HXB-2) deleted or replaced. The amino acid sequences of the modified
polypeptides encoded by the polynucleotides of the present invention can be based on any
HIV variant, for example SF162.

In another aspect, the invention includes immunogenic modified HIV Env
polypeptides having at least one amino acid residue modified (e.g., deleted or replaced)
in the region corresponding to residues 421 to 436 relative to HXB-2, for example a
deletion or replacement of one or more amino acids in the small loop region (e.g., amino acid
residues 427 to 429 relative to HXB-2). These polypeptides may have modifications (e.g., a
deletion or a replacement) of at least one amino acid between about amino acid residue 420 and
amino acid residue 436, relative to HXB-2 and, optionally, may have deletions or truncations of
the V1 and/or V2 regions. The immunogenic, modified polypeptides of the present invention can
be based on any HIV variant, for example SF162.

In another aspect, the invention includes a vaccine composition comprising any of the
polynucleotides encoding modified Env polypeptides described above. Vaccine
compositions comprising the modified Env polypeptides and, optionally, an adjuvant are also
included in the invention.

In yet another aspect, the invention includes a method of inducing an immune
response in a subject comprising administering one or more of the polynucleotides or
constructs described above in an amount sufficient to induce an immune response in the
subject. In certain embodiments, the method further comprises administering an adjuvant to
the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising administering a composition comprising any of the modified Env
polypeptides described above and an adjuvant. The composition is administered in an
amount sufficient to induce an immune response in the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising

(a) administering a first composition comprising any of the polynucleotides described
above in a priming step and

(b) administering a second composition comprising any of the modified Env
polypeptides described above, as a booster, in an amount sufficient to induce an immune
response in the subject. In certain embodiments, the first composition, the second
composition or both the first and second compositions further comprise an adjuvant.

These and other embodiments of the subject invention will readily occur to those of 
skill in the art in light of the disclosure herein. 


Brief Description of the Drawings 

Figure 1 is a schematic depiction of the tertiary structure of the HIV-1 HXB-2 Env gp120
polypeptide, as determined by crystallography studies.

Figures 2A-C depict alignment of the amino acid sequence of wild-type HIV-1 HXB-2
Env gp160 polypeptide (SEQ ID NO:1) with amino acid sequence of HIV variants SF162
(shown as "162") (SEQ ID NO:2), SF2, CM236 and US4. Arrows indicate the regions that
are deleted or replaced in the modified polypeptides. Black dots indicate conserved cysteine
residues. The star indicates the position of the last amino acid in gp120.

Figures 3A-J depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having V1/V2 deletions. The unmodified amino acid residues
encoded by these sequences correspond to wildtype SF162 residues but are numbered relative
to HXB-2.

Figures 4A-M depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having deletions or replacements in the small loop. The
unmodified amino acid residues encoded by these sequences correspond to wildtype SF162
residues but are numbered relative to HXB-2.

Figures 5A-N depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having both V1/V2 deletions and, in addition, deletions or
replacements in the small loop. The unmodified amino acid residues encoded by these
sequences correspond to wildtype SF162 residues but are numbered relative to HXB-2.

Figure 6 depicts the nucleotide sequence of the construct designated Val120-Ala204
(SEQ ID NO:3).

Figure 7 depicts the nucleotide sequence of the construct designated Val120-Ile201
(SEQ ID NO:4).

Figure 8 depicts the nucleotide sequence of the construct designated Val120-Ile201B
(SEQ ID NO:5).

Figure 9 depicts the nucleotide sequence of the construct designated Lys121-Val200
(SEQ ID NO:6).

Figure 10 depicts the nucleotide sequence of the construct designated Leu122-Ser199
(SEQ ID NO:7).

Figure 11 depicts the nucleotide sequence of the construct designated Val120-Thr202
(SEQ ID NO:8).

Figure 12 depicts the nucleotide sequence of the construct designated Trp427-Gly431
(SEQ ID NO:9).

Figure 13 depicts the nucleotide sequence of the construct designated Arg426-Gly431
(SEQ ID NO:10).

Figure 14 depicts the nucleotide sequence of the construct designated Arg426-Gly431B
(SEQ ID NO:11).

Figure 15 depicts the nucleotide sequence of the construct designated Arg426-Lys432
(SEQ ID NO:12).

Figure 16 depicts the nucleotide sequence of the construct designated Asn425-Lys432
(SEQ ID NO:13).

Figure 17 depicts the nucleotide sequence of the construct designated Ile424-Ala433
(SEQ ID NO:14).

Figure 18 depicts the nucleotide sequence of the construct designated Ile423-Met434
(SEQ ID NO:15).

Figure 19 depicts the nucleotide sequence of the construct designated Gln422-Tyr435
(SEQ ID NO:16).

Figure 20 depicts the nucleotide sequence of the construct designated Gln422-Tyr435B
(SEQ ID NO:17).

Figure 21 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Gly431 (SEQ ID NO:18).

Figure 22 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Lys432 (SEQ ID NO:19).

Figure 23 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Trp427-Gly431 (SEQ ID NO:20).

Figure 24 depicts the nucleotide sequence of the construct designated Lys121-Val200;
Asn425-Lys432 (SEQ ID NO:21).

Figure 25 depicts the nucleotide sequence of the construct designated Val120-Ile201;
Ile424-Ala433 (SEQ ID NO:22).

Figure 26 depicts the nucleotide sequence of the construct designated Val120-Ile201B;
Ile424-Ala433 (SEQ ID NO:23).

Figure 27 depicts the nucleotide sequence of the construct designated Val120-Thr202;
Ile424-Ala433 (SEQ ID NO:24).

Figure 28 depicts the nucleotide sequence of the construct designated Val127-Asn195
(SEQ ID NO:25).

Figure 29 depicts the nucleotide sequence of the construct designated Val127-Asn195;
Arg426-Gly431 (SEQ ID NO:26).
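
By way of illustration only, the construct designations listed above follow a regular pattern (a three-letter residue code and a position, a hyphen, a second residue code and position, an optional "B" suffix, and an optional second pair after a semicolon). The following Python sketch splits such a designation into its parts. The function name `parse_designation` and the tuple layout are hypothetical, and the sketch deliberately does not interpret what the two residues of each pair signify.

```python
import re

# Three-letter codes of the 20 natural amino acids.
THREE_LETTER = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# One residue pair, e.g. "Val120-Ile201" or "Val120-Ile201B".
_PAIR = re.compile(r"^([A-Z][a-z]{2})(\d+)-([A-Z][a-z]{2})(\d+)(B?)$")


def parse_designation(name):
    """Split a designation such as 'Leu122-Ser199; Arg426-Gly431' into
    (residue, position, residue, position, suffix) tuples, one per pair."""
    parts = []
    for chunk in name.split(";"):
        match = _PAIR.match(chunk.strip())
        if match is None:
            raise ValueError(f"unrecognized designation: {chunk!r}")
        res1, pos1, res2, pos2, suffix = match.groups()
        if res1 not in THREE_LETTER or res2 not in THREE_LETTER:
            raise ValueError(f"unknown residue code in: {chunk!r}")
        parts.append((res1, int(pos1), res2, int(pos2), suffix))
    return parts


if __name__ == "__main__":
    print(parse_designation("Leu122-Ser199; Arg426-Gly431"))
```

A designation with a single pair yields a one-element list, and a combined designation yields one tuple per pair.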


Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of protein chemistry, viral immunobiology, molecular biology and
recombinant DNA techniques within the skill of the art. Such techniques are explained fully
in the literature. See, e.g., T.E. Creighton, Proteins: Structures and Molecular Properties
(W.H. Freeman and Company, 1993); Nelson L.M and Jerome H.K. HIV Protocols in
Methods in Molecular Medicine, vol. 17, 1999; Sambrook, et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Laboratory, 1989); F.M. Ausubel et al. Current
Protocols in Molecular Biology, Greene Publishing Associates & Wiley Interscience New
York; and Lipkowitz and Boyd, Reviews in Computational Chemistry, volumes 1-present
(Wiley-VCH, New York, New York, 1999).

It must be noted that, as used in this specification and the appended claims, the
singular forms "a", "an" and "the" include plural referents unless the content clearly dictates
otherwise. Thus, for example, reference to "a polypeptide" includes a mixture of two or more
polypeptides, and the like.

Definitions

In describing the present invention, the following terms will be employed, and are
intended to be defined as indicated below.

The terms "polypeptide" and "protein" are used interchangeably herein to denote any
polymer of amino acid residues. The terms encompass peptides, oligopeptides, dimers,
multimers, and the like. Such polypeptides can be derived from natural sources or can be
synthesized or recombinantly produced. The terms also include postexpression modifications
of the polypeptide, for example, glycosylation, acetylation, phosphorylation, etc.

A polypeptide as defined herein is generally made up of the 20 natural amino acids
Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gln (Q), Glu (E), Gly (G), His (H), Ile (I), Leu
(L), Lys (K), Met (M), Phe (F), Pro (P), Ser (S), Thr (T), Trp (W), Tyr (Y) and Val (V) and
may also include any of the several known amino acid analogs, both naturally occurring and
synthesized analogs, such as but not limited to homoisoleucine, asaleucine,
2-(methylenecyclopropyl)glycine, S-methylcysteine, S-(prop-1-enyl)cysteine, homoserine,
ornithine, norleucine, norvaline, homoarginine, 3-(3-carboxyphenyl)alanine,
cyclohexylalanine, mimosine, pipecolic acid, 4-methylglutamic acid, canavanine,
2,3-diaminopropionic acid, and the like. Further examples of polypeptide agents which will find
use in the present invention are set forth below.

By "geometry" or "tertiary structure" of a polypeptide or protein is meant the overall
3-D configuration of the protein. As described herein, the geometry can be determined, for
example, by crystallography studies or by using various programs or algorithms which
predict the geometry based on interactions between the amino acids making up the primary
and secondary structures.






WO 00/39303 PCT/US99/31272 

By "wild type" polypeptide, polypeptide agent or polypeptide drug, is meant a
naturally occurring polypeptide sequence, and its corresponding secondary structure. An
"isolated" or "purified" protein or polypeptide is a protein which is separate and discrete from
a whole organism with which the protein is normally associated in nature. It is apparent that
the term denotes proteins of various levels of purity. Typically, a composition containing a
purified protein will be one in which at least about 35%, preferably at least about 40-50%,
more preferably, at least about 75-85%, and most preferably at least about 90% or more, of
the total protein in the composition will be the protein in question.

By "Env polypeptide" is meant a molecule derived from an envelope protein,
preferably from HIV Env. The envelope protein of HIV-1 is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in (and
spans) the membrane bilayer of the virion, while the gp120 segment protrudes into the
surrounding environment. As there is no covalent attachment between gp120 and gp41, free
gp120 is released from the surface of virions and infected cells. Env polypeptides may also
include gp140 polypeptides. Env polypeptides can exist as monomers, dimers or multimers.

By a "gp120 polypeptide" is meant a molecule derived from a gp120 region of the
Env polypeptide. Preferably, the gp120 polypeptide is derived from HIV Env. The primary
amino acid sequence of gp120 is approximately 511 amino acids, with a polypeptide core of
about 60,000 daltons. The polypeptide is extensively modified by N-linked glycosylation to
increase the apparent molecular weight of the molecule to 120,000 daltons. The amino acid
sequence of gp120 contains five relatively conserved domains interspersed with five
hypervariable domains. The positions of the 18 cysteine residues in the gp120 primary
sequence of the HIV-1 HXB-2 (hereinafter "HXB-2") strain, and the positions of 13 of the
approximately 24 N-linked glycosylation sites in the gp120 sequence are common to most, if
not all, gp120 sequences. The hypervariable domains contain extensive amino acid
substitutions, insertions and deletions. Despite this variation, most, if not all, gp120
sequences preserve the virus's ability to bind to the viral receptor CD4. A "gp120
polypeptide" includes both single subunits and multimers.
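
The figures quoted in this definition can be related by simple arithmetic. The Python sketch below is offered only as a rough illustration: it assumes, for simplicity, that the whole difference between the apparent molecular weight and the polypeptide core is attributable to N-linked glycans and that the approximately 24 sites contribute equally. Neither assumption is stated in the text, and all variable names are hypothetical.

```python
# Approximate values quoted in the definition of a gp120 polypeptide.
RESIDUES = 511                 # amino acids in the primary sequence
CORE_MASS_DA = 60_000          # polypeptide core, daltons
APPARENT_MASS_DA = 120_000     # apparent molecular weight, daltons
GLYCAN_SITES = 24              # approximate number of N-linked sites

# Mean mass per residue of the unglycosylated core.
mean_residue_mass = CORE_MASS_DA / RESIDUES

# Assumption: the full mass difference is carbohydrate.
glycan_mass = APPARENT_MASS_DA - CORE_MASS_DA
glycan_fraction = glycan_mass / APPARENT_MASS_DA

# Assumption: every site carries an equal share of that carbohydrate.
mass_per_site = glycan_mass / GLYCAN_SITES

if __name__ == "__main__":
    print(f"mean residue mass: {mean_residue_mass:.1f} Da")
    print(f"glycan fraction of apparent mass: {glycan_fraction:.0%}")
    print(f"average mass per glycosylation site: {mass_per_site:.0f} Da")
```

Under these assumptions, roughly half of the apparent mass of the glycoprotein is carbohydrate.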

Env polypeptides (e.g., gp120, gp140 and gp160) include a "bridging sheet"
comprised of 4 anti-parallel β-strands (β-2, β-3, β-20 and β-21) that form a β-sheet.
Extruding from one pair of the β-strands (β-2 and β-3) are two loops, V1 and V2. The β-2
strand occurs at approximately amino acid residue 119 (Cys) to amino acid residue 123 (Thr)
while β-3 occurs at approximately amino acid residue 199 (Ser) to amino acid residue 201
(Ile), relative to HXB-2. The "V1/V2 region" occurs at approximately amino acid positions
126 (Cys) to residue 196 (Cys), relative to HXB-2 (see, e.g., Wyatt et al. (1995) J. Virol.
69:5723-5733; Stamatatos et al. (1998) J. Virol. 72:7840-7845). Extruding from the second
pair of β-strands (β-20 and β-21) is a "small-loop" structure, also referred to herein as "the
bridging sheet small loop." In HXB-2, β-20 extends from about amino acid residue 422
(Gln) to amino acid residue 426 (Met) while β-21 extends from about amino acid residue 430
(Val) to amino acid residue 435 (Tyr). In variant SF162, the Met-426 is an Arg (R) residue.
The "small loop" extends from about amino acid residue 427 (Trp) through 429 (Lys),
relative to HXB-2. A representative diagram of gp120 showing the bridging sheet, the small
loop, and V1/V2 is shown in Figure 1. In addition, alignment of the amino acid sequences of
Env polypeptide gp160 of selected variants is shown, relative to HXB-2, in Figures 2A-C.
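
For convenience, the approximate HXB-2 coordinates given in this definition may be collected in a small lookup table. The Python sketch below is illustrative only: the dictionary `REGIONS` and the helper `region_of` are hypothetical names, and the boundaries are the approximate, inclusive residue numbers stated above rather than exact structural limits.

```python
# Approximate inclusive residue ranges, numbered relative to HXB-2,
# as stated in the definition of the bridging sheet.
REGIONS = {
    "beta-2": (119, 123),
    "V1/V2": (126, 196),
    "beta-3": (199, 201),
    "beta-20": (422, 426),
    "small loop": (427, 429),
    "beta-21": (430, 435),
}


def region_of(position):
    """Return the name of the region containing an HXB-2 residue number,
    or None if the position lies outside every listed region."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    for pos in (120, 150, 426, 428, 300):
        print(pos, region_of(pos))
```

Positions between the listed ranges (for example 124 or 197) return None, because the text assigns them to no named region.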
Furthermore, an "Env polypeptide" or "gp120 polypeptide" as defined herein is not
limited to a polypeptide having the exact sequence described herein. Indeed, the HIV
genome is in a state of constant flux and contains several variable domains which exhibit
relatively high degrees of variability between isolates. It is readily apparent that the terms
encompass Env (e.g., gp120) polypeptides from any of the identified HIV isolates, as well as
newly identified isolates, and subtypes of these isolates. Descriptions of structural features
are given herein with reference to HXB-2. One of ordinary skill in the art in view of the
teachings of the present disclosure and the art can determine corresponding regions in other
HIV variants (e.g., isolates HIV IIIB, HIV SF2, HIV-1 SF162, HIV-1 SF170, HIV LAV, HIV LAI,
HIV MN, HIV-1 CM235, HIV-1 US4, other HIV-1 strains from diverse subtypes (e.g., subtypes A
through G, and O), HIV-2 strains and diverse subtypes (e.g., HIV-2 UC1 and HIV-2 UC2), and
simian immunodeficiency virus (SIV)) (see, e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd
Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven, Philadelphia,
PA), for a description of these and other related viruses), using, for example, sequence
comparison programs (e.g., BLAST and others described herein) or identification and
alignment of structural features (e.g., a program such as the "ALB" program described herein
that can identify β-sheet regions). The actual amino acid sequences of the modified Env
polypeptides can be based on any HIV variant.
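
One simple way to carry HXB-2 numbering over to another variant, once the two sequences have been aligned by a sequence comparison program, is to walk the alignment column by column. The Python sketch below illustrates this under stated assumptions: the aligned strings are supplied by the caller (no alignment is computed here), gaps are written as "-", and the function name `map_positions` and the toy sequences are hypothetical and are not actual Env sequences.

```python
def map_positions(aligned_ref, aligned_var, gap="-"):
    """Map 1-based reference positions to 1-based variant positions.

    Both arguments are rows of the same pairwise alignment and must have
    equal length. A reference residue aligned to a gap maps to None.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = 0
    var_pos = 0
    for ref_char, var_char in zip(aligned_ref, aligned_var):
        if var_char != gap:
            var_pos += 1          # advance variant numbering
        if ref_char != gap:
            ref_pos += 1          # advance reference numbering
            mapping[ref_pos] = var_pos if var_char != gap else None
    return mapping


if __name__ == "__main__":
    # Toy alignment: the variant lacks two residues present in the reference.
    reference = "ACDEFG"
    variant = "A--EFG"
    print(map_positions(reference, variant))
```

In the toy alignment, reference positions 2 and 3 have no counterpart in the variant, and reference position 4 corresponds to variant position 2. The same bookkeeping explains why residues in the figures can correspond to one variant while being numbered relative to HXB-2.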

Additionally, the term "Env polypeptide" (e.g., "gp120 polypeptide") encompasses
proteins which include additional modifications to the native sequence, such as additional
internal deletions, additions and substitutions. These modifications may be deliberate, as
through site-directed mutagenesis, or may be accidental, such as through naturally occurring
mutational events. Thus, for example, if the Env polypeptide is to be used in vaccine
compositions, the modifications must be such that immunological activity (i.e., the ability to
elicit an antibody response to the polypeptide) is not lost. Similarly, if the polypeptides are to
be used for diagnostic purposes, such capability must be retained.

Thus, a "modified Env polypeptide" is an Env polypeptide (e.g., gp120 as defined
above), which has been manipulated to delete or replace all or a part of the bridging sheet
portion and, optionally, the variable regions V1 and V2. Generally, modified Env (e.g.,
gp120) polypeptides have enough of the bridging sheet removed to expose the CD4 binding
site, but leave enough of the structure to allow correct folding (e.g., correct geometry). Thus,
modifications to the β-20 and β-21 regions (between about amino acid residues 420 and 435
relative to HXB-2) are preferred. Additionally, modifications to the β-2 and β-3 regions
(between about amino acid residues 119 (Cys) and 201 (Ile)) and modifications (e.g.,
truncations) to the V1 and V2 loop regions may also be made. Although not all possible
β-sheet and V1/V2 modifications have been exemplified herein, it is to be understood that other
disrupting modifications are also encompassed by the present invention.

Normally, such a modified polypeptide is capable of secretion into growth medium in
which an organism expressing the protein is cultured. However, for purposes of the present
invention, such polypeptides may also be recovered intracellularly. Secretion into growth
media is readily determined using a number of detection techniques, including, e.g.,
polyacrylamide gel electrophoresis and the like, and immunological techniques such as
Western blotting and immunoprecipitation assays as described in, e.g., International
Publication No. WO 96/04301, published February 15, 1996.

A gp120 or other Env polypeptide is produced "intracellularly" when it is found
within the cell, either associated with components of the cell, such as in association with the
endoplasmic reticulum (ER) or the Golgi Apparatus, or when it is present in the soluble
cellular fraction. The gp120 and other Env polypeptides of the present invention may also be
secreted into growth medium so long as sufficient amounts of the polypeptides remain
present within the cell such that they can be purified from cell lysates using techniques
described herein.

An "immunogenic" gp120 or other Env protein is a molecule that includes at least one
epitope such that the molecule is capable of either eliciting an immunological reaction in an
individual to which the protein is administered or, in the diagnostic context, is capable of
reacting with antibodies directed against the HIV in question.

By "epitope" is meant a site on an antigen to which specific B cells and/or T cells
respond, rendering the molecule including such an epitope capable of eliciting an
immunological reaction or capable of reacting with HIV antibodies present in a biological
sample. The term is also used interchangeably with "antigenic determinant" or "antigenic
determinant site." An epitope can comprise 3 or more amino acids in a spatial conformation
unique to the epitope. Generally, an epitope consists of at least 5 such amino acids and, more
usually, consists of at least 8-10 such amino acids. Methods of determining spatial
conformation of amino acids are known in the art and include, for example, x-ray
crystallography and 2-dimensional nuclear magnetic resonance. Furthermore, the
identification of epitopes in a given protein is readily accomplished using techniques well
known in the art, such as by the use of hydrophobicity studies and by site-directed serology.
See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81:3998-4002 (general method of
rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given
antigen); U.S. Patent No. 4,708,871 (procedures for identifying and chemically synthesizing
epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23:709-715
(technique for identifying peptides with high affinity for a given antibody). Antibodies that
recognize the same epitope can be identified in a simple immunoassay showing the ability of
one antibody to block the binding of another antibody to a target antigen.

An "immunological response" or "immune response" as used herein is the
development in the subject of a humoral and/or a cellular immune response to the Env (e.g.,
gp120) polypeptide when the polypeptide is present in a vaccine composition. These
antibodies may also neutralize infectivity, and/or mediate antibody-complement or antibody
dependent cell cytotoxicity to provide protection to an immunized host. Immunological
reactivity may be determined in standard immunoassays, such as competition assays, well
known in the art.

Techniques for determining amino acid sequence "similarity" are well known in the
art. In general, "similarity" means the exact amino acid to amino acid comparison of two or
more polypeptides at the appropriate place, where amino acids are identical or possess similar
chemical and/or physical properties such as charge or hydrophobicity. A so-termed "percent
similarity" then can be determined between the compared polypeptide sequences.

Techniques for determining nucleic acid and amino acid sequence identity also are well 
known in the art and include determining the nucleotide sequence of the mRNA for that gene 
(usually via a cDNA intermediate) and determining the amino acid sequence encoded 
thereby, and comparing this to a second amino acid sequence. In general, "identity" refers to
an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two
polynucleotides or polypeptide sequences, respectively. 

Two or more polynucleotide sequences can be compared by determining their 
"percent identity." Two or more amino acid sequences likewise can be compared by 
determining their "percent identity." The percent identity of two sequences, whether nucleic
acid or peptide sequences, is generally described as the number of exact matches between two
aligned sequences divided by the length of the shorter sequence and multiplied by 100. An
approximate alignment for nucleic acid sequences is provided by the local homology
algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981).
This algorithm can be extended to use with peptide sequences using the scoring matrix
developed by Dayhoff, Atlas of Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl.
3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and
normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An implementation of
this algorithm for nucleic acid and peptide sequences is provided by the Genetics Computer
Group (Madison, WI) in their BestFit utility application. The default parameters for this
method are described in the Wisconsin Sequence Analysis Package Program Manual, Version
8 (1995) (available from Genetics Computer Group, Madison, WI). Other equally suitable 
programs for calculating the percent identity or similarity between sequences are generally 
known in the art. 
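
By way of illustration only, the percent identity calculation described above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) might be sketched as follows. The function name `percent_identity` and the use of "-" as the gap character are assumptions made for this sketch; they are not taken from BestFit or any other program named herein, and the two inputs are assumed to have been aligned beforehand.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences (hypothetical helper).

    Both inputs must already be aligned, so they have the same length and
    use `gap` to mark alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Length of the shorter sequence, counted without gap characters.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("each sequence must contain at least one residue")

    return 100.0 * matches / shorter


if __name__ == "__main__":
    # Seven of eight aligned positions match.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Dividing by the length of the shorter sequence, rather than by the alignment length, follows the definition given in the text; other programs may normalize differently.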

For example, percent identity of a particular nucleotide sequence to a reference 
sequence can be determined using the homology algorithm of Smith and Waterman with a
default scoring table and a gap penalty of six nucleotide positions. Another method of
establishing percent identity in the context of the present invention is to use the MPSRCH
package of programs copyrighted by the University of Edinburgh, developed by John F.
Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA).
From this suite of packages, the Smith-Waterman algorithm can be employed where default
parameters are used for the scoring table (for example, gap open penalty of 12, gap extension 
penalty of one, and a gap of six). From the data generated, the "Match" value reflects 
"sequence identity." Other suitable programs for calculating the percent identity or similarity 
between sequences are generally known in the art, such as the alignment program BLAST, 
which can also be used with default parameters. For example, BLASTN and BLASTP can be 
used with the following default parameters: genetic code = standard; filter = none; strand = 
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = 
HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank 
CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found 
at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST. 
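
For orientation, the local alignment idea underlying the Smith-Waterman algorithm can be sketched as below. This is a simplified illustration under stated assumptions, not the MPSRCH, BestFit or BLAST implementation: it uses a single linear gap penalty instead of the separate gap open and gap extension penalties mentioned above, and the scoring values (match +1, mismatch -1, gap -2) and the function name `smith_waterman_score` are chosen only for the example.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Simplified sketch: real packages typically use affine gap penalties
    and a substitution matrix rather than fixed match/mismatch values.
    """
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the dynamic-programming matrix
    best = 0

    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            # Extend the alignment diagonally with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or continue a gap in either sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr

    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only the best score is returned here; recovering the aligned residues themselves would require keeping the full matrix and tracing back from the highest-scoring cell.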

One of skill in the art can readily determine the proper search parameters to use for a 
given sequence in the above programs. For example, the search parameters may vary based 
on the size of the sequence in question. Thus, for example, a representative embodiment of 
the present invention would include an isolated polynucleotide having X contiguous 
nucleotides, wherein (i) the X contiguous nucleotides have at least about 50% identity to Y 
contiguous nucleotides derived from any of the sequences described herein, (ii) X equals Y, 
and (iii) X is greater than or equal to 6 nucleotides and up to 5000 nucleotides, preferably 
greater than or equal to 8 nucleotides and up to 5000 nucleotides, more preferably 10-12 
nucleotides and up to 5000 nucleotides, and even more preferably 15-20 nucleotides, up to 
the number of nucleotides present in the full-length sequences described herein (e.g., see the 
Sequence Listing and claims), including all integer values falling within the above-described 
ranges. 
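
As a rough illustration of the representative embodiment above, the following sketch checks whether a query of X contiguous nucleotides has at least a given percent identity to some window of Y = X contiguous nucleotides in a reference sequence. The comparison is ungapped, which is a simplifying assumption of this sketch; the function names `best_window_identity` and `meets_embodiment` are hypothetical, and the defaults (50% identity, 6 to 5000 nucleotides) simply restate the broadest range recited in the text.

```python
def best_window_identity(query: str, reference: str) -> float:
    """Highest ungapped percent identity of `query` against any
    equal-length window of `reference` (hypothetical helper)."""
    query, reference = query.upper(), reference.upper()
    x = len(query)
    if x == 0 or x > len(reference):
        raise ValueError("query must be non-empty and no longer than reference")

    best = 0.0
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]  # Y contiguous nucleotides, Y == X
        matches = sum(1 for q, r in zip(query, window) if q == r)
        best = max(best, 100.0 * matches / x)
    return best


def meets_embodiment(
    query: str,
    reference: str,
    min_identity: float = 50.0,
    min_length: int = 6,
    max_length: int = 5000,
) -> bool:
    """True if the query length is within range and some reference window
    reaches the identity threshold."""
    x = len(query)
    if not (min_length <= x <= max_length) or x > len(reference):
        return False
    return best_window_identity(query, reference) >= min_identity


if __name__ == "__main__":
    print(meets_embodiment("ACGTAC", "TTACGTACTT"))
```

In practice the search parameters, including whether gaps are permitted, would be chosen by one of skill in the art as noted above.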

The synthetic expression cassettes (and purified polynucleotides) of the present 
invention include related polynucleotide sequences having about 80% to 100%, greater than 
80-85%, preferably greater than 90-92%, more preferably greater than 95%, and most 
preferably greater than 98% sequence (including all integer values falling within these 
described ranges) identity to the synthetic expression cassette sequences disclosed herein (for 
example, to the claimed sequences or other sequences of the present invention) when the 
sequences of the present invention are used as the query sequence.

Computer programs are also available to determine the likelihood of certain
polypeptides to form structures such as β-sheets. One such program, described herein, is the
"ALB" program for protein and polypeptide secondary structure calculation and prediction.
In addition, secondary protein structure can be predicted from the primary amino acid
sequence, for example using protein crystal structure and aligning the protein sequence
related to the crystal structure (e.g., using Molecular Operating Environment (MOE)
programs available from the Chemical Computing Group Inc., Montreal, P.Q., Canada).
Other methods of predicting secondary structures are described, for example, in Garnier et al.
(1996) Methods Enzymol. 266:540-553; Geourjon et al. (1995) Comput. Applic. Biosci.
11:681-684; Levin (1997) Protein Eng. 10:771-776; and Rost et al. (1993) J. Molec. Biol.
232:584-599.

Homology can also be determined by hybridization of polynucleotides under 
conditions which form stable duplexes between homologous regions, followed by digestion 
with single-stranded-specific nuclease(s), and size determination of the digested fragments.
Two DNA, or two polypeptide, sequences are "substantially homologous" to each other when
the sequences exhibit at least about 80%-85%, preferably at least about 90%, and most
preferably at least about 95%-98% sequence identity over a defined length of the molecules,
as determined using the methods above. As used herein, substantially homologous also refers
to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA
sequences that are substantially homologous can be identified in a Southern hybridization
experiment under, for example, stringent conditions, as defined for that particular system.
Defining appropriate hybridization conditions is within the skill of the art. See, e.g.,
Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.

A "coding sequence" or a sequence which "encodes" a selected protein, is a nucleic
acid sequence which is transcribed (in the case of DNA) and translated (in the case of
mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate
regulatory sequences. The boundaries of the coding sequence are determined by a start codon
at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding
sequence can include, but is not limited to cDNA from viral nucleotide sequences as well as
synthetic and semisynthetic DNA sequences and sequences including base analogs. A
transcription termination sequence may be located 3' to the coding sequence.

"Control elements" refers collectively to promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination sequences, upstream regulatory domains,
enhancers, and the like, which collectively provide for the transcription and translation of a
coding sequence in a host cell. Not all of these control elements need always be present so
long as the desired gene is capable of being transcribed and translated.

A control element "directs the transcription" of a coding sequence in a cell when RNA 
polymerase will bind the promoter sequence and transcribe the coding sequence into mRNA, 
which is then translated into the polypeptide encoded by the coding sequence.

"Operably linked" refers to an arrangement of elements wherein the components so
described are configured so as to perform their usual function. Thus, control elements
operably linked to a coding sequence are capable of effecting the expression of the coding
sequence when RNA polymerase is present. The control elements need not be contiguous
with the coding sequence, so long as they function to direct the expression thereof. Thus, for
example, intervening untranslated yet transcribed sequences can be present between, e.g., a
promoter sequence and the coding sequence and the promoter sequence can still be
considered "operably linked" to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its 
origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with
which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to
which it is linked in nature. The term "recombinant" as used with respect to a protein or
polypeptide means a polypeptide produced by expression of a recombinant polynucleotide.
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures," and other such
terms denoting procaryotic microorganisms or eucaryotic cell lines cultured as unicellular
entities, are used interchangeably, and refer to cells which can be, or have been, used as
recipients for recombinant vectors or other transfer DNA, and include the progeny of the
original cell which has been transfected. It is understood that the progeny of a single parental
cell may not necessarily be completely identical in morphology or in genomic or total DNA
complement to the original parent, due to accidental or deliberate mutation. Progeny of the
parental cell which are sufficiently similar to the parent to be characterized by the relevant
property, such as the presence of a nucleotide sequence encoding a desired peptide, are
included in the progeny intended by this definition, and are covered by the above terms.

By "vertebrate subject" is meant any member of the subphylum chordata, including,
without limitation, humans and other primates, including non-human primates such as
chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs,
goats and horses; domestic mammals such as dogs and cats; laboratory animals including
rodents such as mice, rats and guinea pigs; birds, including domestic, wild and game birds
such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like. The term
does not denote a particular age. Thus, both adult and newborn individuals are intended to be
covered.

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated
from an individual, including but not limited to, for example, blood, plasma, serum, fecal
matter, urine, bone marrow, bile, spinal fluid, lymph fluid, samples of the skin, external
secretions of the skin, respiratory, intestinal, and genitourinary tracts, samples derived from
the gastric epithelium and gastric mucosa, tears, saliva, milk, blood cells, organs, biopsies
and also samples of in vitro cell culture constituents including but not limited to conditioned
media resulting from the growth of cells and tissues in culture medium, e.g., recombinant
cells, and cell components.

The terms "label" and "detectable label" refer to a molecule capable of detection,
including, but not limited to, radioactive isotopes, fluorescers, chemiluminescers, enzymes,
enzyme substrates, enzyme cofactors, enzyme inhibitors, chromophores, dyes, metal ions,
metal sols, ligands (e.g., biotin or haptens) and the like. The term "fluorescer" refers to a
substance or a portion thereof which is capable of exhibiting fluorescence in the detectable
range. Particular examples of labels which may be used with the invention include, but are
not limited to fluorescein, rhodamine, dansyl, umbelliferone, Texas red, luminol, acridinium
esters, NADPH, α-β-galactosidase, horseradish peroxidase, glucose oxidase, alkaline
phosphatase and urease.

Overview 

The present invention concerns modified Env polypeptide molecules (e.g.,
glycoprotein ("gp") 120). Without being bound by a particular theory, it appears that it has
been difficult to generate immunological responses against Env because the CD4 binding site
is buried between the outer domain, the inner domain and the V1/V2 domains. Thus,
although deletion of the V1/V2 domain may render the virus more susceptible to
neutralization by monoclonal antibody directed to the CD4 site, the bridging sheet covering
most of the CD4 binding domain may prevent an antibody response. Thus, the present
invention provides Env polypeptides that maintain their general overall structure yet expose
the CD4 binding domain. This allows the generation of an immune response (e.g., an
antibody response) to epitopes in or near the CD4 binding site.

Various forms of the different embodiments of the invention, described herein, may 
be combined. 

β-Sheet Conformations

In the present invention, the locations of the β-sheet structures were identified relative to
the 3-D (crystal) structure of an HXB-2 crystallized Env protein (see Example 1.A). Based on
this structure, constructs encoding polypeptides having replacements and/or excisions which
maintain overall geometry while exposing the CD4 binding site were designed. In particular,
the crystal structure of HXB-2 was downloaded from the Brookhaven Database. Using the
default parameters of the Loop Search feature of the Biopolymer module of the Sybyl
molecular modeling package, homology and fit of amino acids which could replace the native
loops between β-strands yet maintain overall tertiary structure were determined. Constructs
encoding the modified Env polypeptides were then designed (Example 1.B).

Thus, the modified Env polypeptides typically have enough of the bridging sheet 
removed to expose the CD4 groove, but have enough of the structure to allow correct folding 
of the Env glycoprotein. Exemplary constructs are described below. 

Polypeptide Production 

The polypeptides of the present invention can be produced in any number of ways 
which are well known in the art. 

In one embodiment, the polypeptides are generated using recombinant techniques, 
well known in the art. In this regard, oligonucleotide probes can be devised based on the 
known sequences of the Env (e.g., gp120) polypeptide genome and used to probe genomic or
cDNA libraries for Env genes. The gene can then be further isolated using standard 
techniques and, e.g., restriction enzymes employed to truncate the gene at desired portions of 
the full-length sequence. Similarly, the Env gene(s) can be isolated directly from cells and 
tissues containing the same, using known techniques, such as phenol extraction and the
sequence further manipulated to produce the desired truncations. See, e.g., Sambrook et al.,
supra, for a description of techniques used to obtain and isolate DNA. 

The genes encoding the modified (e.g., truncated and/or substituted) polypeptides can 
be produced synthetically, based on the known sequences. The nucleotide sequence can be 
designed with the appropriate codons for the particular amino acid sequence desired. The 
complete sequence is generally assembled from overlapping oligonucleotides prepared by 
standard methods and assembled into a complete coding sequence. See, e.g., Edge (1981) 
Nature 292:756; Nambair et al. (1984) Science 223:1299; Jay et al. (1984) J. Biol. Chem.
259:6311; Stemmer et al. (1995) Gene 164:49-53.
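
The step of designing a nucleotide sequence with appropriate codons for a desired amino acid sequence can be illustrated with a minimal back-translation sketch. The codon choices in `PREFERRED_CODON` are illustrative assumptions only (one valid standard-code codon per amino acid); they are not taken from the present disclosure, and an actual design would normally draw on a codon usage table for the chosen expression host. The function name `back_translate` is likewise hypothetical.

```python
# One codon per amino acid under the standard genetic code.
# Illustrative choices only; a real design would use host codon usage data.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return a DNA coding sequence for `protein` (one-letter codes)."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unsupported amino acid code: {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append("TGA")  # translation stop codon at the 3' end
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MKV"))
```

The resulting sequence would then be divided into overlapping oligonucleotides for assembly, a step this sketch does not attempt to model.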

Recombinant techniques are readily used to clone a gene encoding an Env
polypeptide, which can then be mutagenized in vitro by the replacement of the
appropriate base pair(s) to result in the codon for the desired amino acid. Such a change can 
include as little as one base pair, effecting a change in a single amino acid, or can encompass 
several base pair changes. Alternatively, the mutations can be effected using a mismatched 
primer which hybridizes to the parent nucleotide sequence (generally cDNA corresponding to 
the RNA sequence), at a temperature below the melting temperature of the mismatched 
duplex. The primer can be made specific by keeping primer length and base composition 
within relatively narrow limits and by keeping the mutant base centrally located. See, e.g., 
Innis et al. (1990) PCR Applications: Protocols for Functional Genomics; Zoller and Smith,
Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the
product cloned and clones containing the mutated DNA, derived by segregation of the primer
extended strand, selected. Selection can be accomplished using the mutant primer as a
hybridization probe. The technique is also applicable for generating multiple point
mutations. See, e.g., Dalbie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409.
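
As a simple illustration of keeping the mutant base centrally located, the sketch below builds a mismatched primer sequence by taking equal flanks of parent sequence on either side of the substituted position. It is a sketch under stated assumptions: the function name `mutagenic_primer` and the default flank of 10 nucleotides are hypothetical, the sequence is written in the same sense as the input, and whether that sequence or its reverse complement is synthesized depends on which strand serves as template. Melting temperature and base composition are not evaluated.

```python
def mutagenic_primer(
    parent: str, position: int, new_base: str, flank: int = 10
) -> str:
    """Primer sequence carrying a single substitution at its center.

    `position` is the 0-based index of the base to change in `parent`;
    `flank` is the number of unchanged bases kept on each side.
    """
    parent = parent.upper()
    new_base = new_base.upper()

    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G or T")
    if position - flank < 0 or position + flank >= len(parent):
        raise ValueError("not enough flanking sequence around position")
    if parent[position] == new_base:
        raise ValueError("new_base is identical to the parent base")

    upstream = parent[position - flank:position]
    downstream = parent[position + 1:position + flank + 1]
    # Equal flanks place the mismatch at the middle of the primer.
    return upstream + new_base + downstream


if __name__ == "__main__":
    print(mutagenic_primer("AAAAACCCCCGGGGGTTTTT", position=10, new_base="T", flank=3))
```

A primer produced this way has length 2 × flank + 1, so adjusting `flank` is one way to keep primer length within the relatively narrow limits mentioned above.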

Once coding sequences for the desired proteins have been isolated or synthesized, 
they can be cloned into any suitable vector or replicon for expression. As will be apparent 
from the teachings herein, a wide variety of vectors encoding modified polypeptides can be 
generated by creating expression constructs which operably link, in various combinations, 
polynucleotides encoding Env polypeptides having deletions or mutations therein. Thus,
polynucleotides encoding a particular deleted V1/V2 region can be operably linked with
polynucleotides encoding polypeptides having deletions or replacements in the small loop
region and the construct introduced into a host cell for polypeptide expression. Non-limiting
examples of such combinations are discussed in the Examples.

Numerous cloning vectors are known to those of skill in the art, and the selection of 
an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors 
for cloning and host cells which they can transform include the bacteriophage λ (E. coli),
pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106
(gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli
gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61
(Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), YCp19 (Saccharomyces) and
bovine papilloma virus (mammalian cells). See, generally, DNA Cloning: Vols. I & II, supra;
Sambrook et al., supra; B. Perbal, supra.

Insect cell expression systems, such as baculovirus systems, can also be used and are 
known to those of skill in the art and described in, e.g., Summers and Smith, Texas 
Agricultural Experiment Station Bulletin No. 1555 (1987). Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from, inter
alia, Invitrogen, San Diego CA ("MaxBac" kit).

Plant expression systems can also be used to produce the modified Env proteins. 
Generally, such systems use virus-based vectors to transfect plant cells with heterologous 
genes. For a description of such systems see, e.g., Porta et al., Mol. Biotech. (1996) 5:209-221;
and Hackland et al., Arch. Virol. (1994) 139:1-22.

Viral systems, such as a vaccinia based infection/transfection system, as described in
Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen. Virol. (1993)
74:1103-1113, will also find use with the present invention. In this system, cells are first
infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage T7 RNA
polymerase. This polymerase displays exquisite specificity in that it only transcribes
templates bearing T7 promoters. Following infection, cells are transfected with the DNA of
interest, driven by a T7 promoter. The polymerase expressed in the cytoplasm from the
vaccinia virus recombinant transcribes the transfected DNA into RNA which is then
translated into protein by the host translational machinery. The method provides for high
level, transient, cytoplasmic production of large quantities of RNA and its translation
product(s).

The gene can be placed under the control of a promoter, ribosome binding site (for
bacterial expression) and, optionally, an operator (collectively referred to herein as "control"
elements), so that the DNA sequence encoding the desired Env polypeptide is transcribed into
RNA in the host cell transformed by a vector containing this expression construction. The
coding sequence may or may not contain a signal peptide or leader sequence. With the
present invention, both the naturally occurring signal peptides and heterologous sequences can
be used. Leader sequences can be removed by the host in post-translational processing. See,
e.g., U.S. Patent Nos. 4,431,739; 4,425,437; 4,338,397. Such sequences include, but are not
limited to, the TPA leader, as well as the honey bee melittin signal sequence.

Other regulatory sequences may also be desirable which allow for regulation of 
expression of the protein sequences relative to the growth of the host cell. Such regulatory 
sequences are known to those of skill in the art, and examples include those which cause the 
expression of a gene to be turned on or off in response to a chemical or physical stimulus, 
including the presence of a regulatory compound. Other types of regulatory elements may 
also be present in the vector, for example, enhancer sequences. 

The control sequences and other regulatory sequences may be ligated to the coding 
sequence prior to insertion into a vector. Alternatively, the coding sequence can be cloned 
directly into an expression vector which already contains the control sequences and an 
appropriate restriction site. 

In some cases it may be necessary to modify the coding sequence so that it may be 
attached to the control sequences with the appropriate orientation; i.e., to maintain the proper 
reading frame. Mutants or analogs may be prepared by the deletion of a portion of the 
sequence encoding the protein, by insertion of a sequence, and/or by substitution of one or 
more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such 
as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., Sambrook
et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra.

The expression vector is then used to transform an appropriate host cell. A number of 
mammalian cell lines are known in the art and include immortalized cell lines available from 
the American Type Culture Collection (ATCC), such as, but not limited to, Chinese hamster 
ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells 
(COS), human hepatocellular carcinoma cells (e.g., Hep G2), Vero cells, 293 cells, as well as others.
Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus spp., will find
use with the present expression constructs. Yeast hosts useful in the present invention
include inter alia, Saccharomyces cerevisiae, Candida albicans, Candida maltosa,
Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, Pichia guillerimondii,
Pichia pastoris, Schizosaccharomyces pombe and Yarrowia lipolytica. Insect cells for use
with baculovirus expression vectors include, inter alia, Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni.

Depending on the expression system and host selected, the proteins of the present 
invention are produced by growing host cells transformed by an expression vector described
above under conditions whereby the protein of interest is expressed. The selection of the
appropriate growth conditions is within the skill of the art.

In one embodiment, the transformed cells secrete the polypeptide product into the 
surrounding media. Certain regulatory sequences can be included in the vector to enhance 
secretion of the protein product, for example using a tissue plasminogen activator (TPA)
leader sequence, a γ-interferon signal sequence or other signal peptide sequences from known
secretory proteins. The secreted polypeptide product can then be isolated by various
techniques described herein, for example, using standard purification techniques such as but
not limited to, hydroxyapatite resins, column chromatography, ion-exchange
chromatography, size-exclusion chromatography, electrophoresis, HPLC, immunoadsorbent
techniques, affinity chromatography, immunoprecipitation, and the like.

Alternatively, the transformed cells are disrupted, using chemical, physical or 
mechanical means, which lyse the cells yet keep the Env polypeptides substantially intact. 
Intracellular proteins can also be obtained by removing components from the cell wall or
membrane, e.g., by the use of detergents or organic solvents, such that leakage of the Env
polypeptides occurs. Such methods are known to those of skill in the art and are described in,
e.g., Protein Purification Applications: A Practical Approach, (E.L.V. Harris and S. Angal,
Eds., 1990).

For example, methods of disrupting cells for use with the present invention include
but are not limited to: sonication or ultrasonication; agitation; liquid or solid extrusion; heat
treatment; freeze-thaw; desiccation; explosive decompression; osmotic shock; treatment with
lytic enzymes including proteases such as trypsin, neuraminidase and lysozyme; alkali
treatment; and the use of detergents and solvents such as bile salts, sodium dodecylsulphate,
Triton, NP40 and CHAPS. The particular technique used to disrupt the cells is largely a
matter of choice and will depend on the cell type in which the polypeptide is expressed, 
culture conditions and any pre-treatment used. 

Following disruption of the cells, cellular debris is removed, generally by
centrifugation, and the intracellularly produced Env polypeptides are further purified, using
standard purification techniques such as but not limited to, column chromatography,
ion-exchange chromatography, size-exclusion chromatography, electrophoresis, HPLC,
immunoadsorbent techniques, affinity chromatography, immunoprecipitation, and the like. 

For example, one method for obtaining the intracellular Env polypeptides of the 
present invention involves affinity purification, such as by immunoaffinity chromatography 
using anti-Env specific antibodies, or by lectin affinity chromatography. Particularly 
preferred lectin resins are those that recognize mannose moieties such as but not limited to 
resins derived from Galanthus nivalis agglutinin (GNA), Lens culinaris agglutinin (LCA or 
lentil lectin), Pisum sativum agglutinin (PSA or pea lectin), Narcissus pseudonarcissus 
agglutinin (NPA) and Allium ursinum agglutinin (AUA). The choice of a suitable affinity 
resin is within the skill in the art. After affinity purification, the Env polypeptides can be 
further purified using conventional techniques well known in the art, such as by any of the 
techniques described above. 

It may be desirable to produce Env (e.g., gp120) complexes, either with itself or other
proteins. Such complexes are readily produced by, e.g., co-transfecting host cells with
constructs encoding the Env (e.g., gp120) and/or other polypeptides of the desired
complex. Co-transfection can be accomplished either in trans or cis, i.e., by using separate
vectors or by using a single vector which bears both the Env and other genes. If done using
a single vector, both genes can be driven by a single set of control elements or, alternatively,
the genes can be present on the vector in individual expression cassettes, driven by individual
control elements. Following expression, the proteins will spontaneously associate.
Alternatively, the complexes can be formed by mixing the individual proteins together which
have been produced separately, either in purified or semi-purified form, or even by mixing
culture media in which host cells expressing the proteins have been cultured. See
International Publication No. WO 96/04301, published February 15, 1996, for a description
of such complexes.

Relatively small polypeptides, i.e., up to about 50 amino acids in length, can be
conveniently synthesized chemically, for example by any of several techniques that are 
known to those skilled in the peptide art. In general, these methods employ the sequential 
addition of one or more amino acids to a growing peptide chain. Normally, either the amino 
or carboxyl group of the first amino acid is protected by a suitable protecting group. The 
protected or derivatized amino acid can then be either attached to an inert solid support or 
utilized in solution by adding the next amino acid in the sequence having the complementary 
(amino or carboxyl) group suitably protected, under conditions that allow for the formation of 
an amide linkage. The protecting group is then removed from the newly added amino acid 
residue and the next amino acid (suitably protected) is then added, and so forth. After the 
desired amino acids have been linked in the proper sequence, any remaining protecting 
groups (and any solid support, if solid phase synthesis techniques are used) are removed 
sequentially or concurrently, to render the final polypeptide. By simple modification of this 
general procedure, it is possible to add more than one amino acid at a time to a growing 
chain, for example, by coupling (under conditions which do not racemize chiral centers) a 
protected tripeptide with a properly protected dipeptide to form, after deprotection, a 
pentapeptide. See, e.g., J. M. Stewart and J. D. Young, Solid Phase Peptide Synthesis
(Pierce Chemical Co., Rockford, IL 1984) and G. Barany and R. B. Merrifield, The Peptides:
Analysis, Synthesis, Biology, editors E. Gross and J. Meienhofer, Vol. 2 (Academic Press,
New York, 1980), pp. 3-254, for solid phase peptide synthesis techniques; and M. Bodansky,
Principles of Peptide Synthesis (Springer-Verlag, Berlin 1984) and E. Gross and J.
Meienhofer, Eds., The Peptides: Analysis, Synthesis, Biology, Vol. 1, for classical solution
synthesis.

Typical protecting groups include t-butyloxycarbonyl (Boc); 9-fluorenylmethoxycarbonyl
(Fmoc); benzyloxycarbonyl (Cbz); p-toluenesulfonyl (Tx); 2,4-dinitrophenyl; benzyl (Bzl);
biphenylisopropyloxycarboxy-carbonyl; t-amyloxycarbonyl; isobornyloxycarbonyl;
o-bromobenzyloxycarbonyl; cyclohexyl; isopropyl; acetyl; o-nitrophenylsulfonyl and the like.

Typical solid supports are cross-linked polymeric supports. These can include 
divinylbenzene cross-linked-styrene-based polymers, for example, divinylbenzene- 
hydroxymethylstyrene copolymers, divinylbenzene-chloromethylstyrene copolymers and 
divinylbenzene-benzhydrylaminopolystyrene copolymers.

The polypeptide analogs of the present invention can also be chemically prepared by
other methods such as by the method of simultaneous multiple peptide synthesis. See, e.g.,
Houghten, Proc. Natl. Acad. Sci. USA (1985) 82:5131-5135; U.S. Patent No. 4,631,211.

Diagnostic and Vaccine Applications

The intracellularly produced Env polypeptides of the present invention, complexes
thereof, or the polynucleotides coding therefor, can be used for a number of diagnostic and
therapeutic purposes. For example, the proteins and polynucleotides, or antibodies generated
against the same, can be used in a variety of assays to determine the presence of reactive
antibodies and/or Env proteins in a biological sample, to aid in the diagnosis of HIV infection
or disease status, or as a measure of response to immunization.

The presence of antibodies reactive with the Env (e.g., gp120) polypeptides and,
conversely, antigens reactive with antibodies generated thereto, can be detected using
standard electrophoretic and immunodiagnostic techniques, including immunoassays such as
competition, direct reaction, or sandwich type assays. Such assays include, but are not
limited to, western blots; agglutination tests; enzyme-labeled and mediated immunoassays,
such as ELISAs; biotin/avidin type assays; radioimmunoassays; immunoelectrophoresis;
immunoprecipitation, etc. The reactions generally include revealing labels such as
fluorescent, chemiluminescent, radioactive, or enzymatic labels or dye molecules, or other
methods for detecting the formation of a complex between the antigen and the antibody or
antibodies reacted therewith.

Solid supports can be used in the assays such as nitrocellulose, in membrane or
microtiter well form; polyvinylchloride, in sheets or microtiter wells; polystyrene latex, in
beads or microtiter plates; polyvinylidene fluoride; diazotized paper; nylon membranes;
activated beads, and the like.

Typically, the solid support is first reacted with the biological sample (or the gp120
proteins) and washed, and then the antibodies (or a sample suspected of containing antibodies)
are applied. After washing to remove any non-bound ligand, a secondary binder moiety is added
under suitable binding conditions, such that the secondary binder is capable of associating
selectively with the bound ligand. The presence of the secondary binder can then be detected
using techniques well known in the art. Typically, the secondary binder will comprise an
antibody directed against the antibody ligands. A number of anti-human immunoglobulin
(Ig) molecules are known in the art (e.g., commercially available goat anti-human Ig or rabbit
anti-human Ig). Ig molecules for use herein will preferably be of the IgG or IgA type;
however, IgM may also be appropriate in some instances. The Ig molecules can be readily
conjugated to a detectable enzyme label, such as horseradish peroxidase, glucose oxidase,
beta-galactosidase, alkaline phosphatase and urease, among others, using methods known to
those of skill in the art. An appropriate enzyme substrate is then used to generate a detectable
signal.

Alternatively, a "two antibody sandwich" assay can be used to detect the proteins of 
the present invention. In this technique, the solid support is reacted first with one or more of 
the antibodies directed against Env (e.g., gp120), washed and then exposed to the test sample.
Antibodies are again added and the reaction visualized using either a direct color reaction or 
using a labeled second antibody, such as an anti-immunoglobulin labeled with horseradish 
peroxidase, alkaline phosphatase or urease. 

Assays can also be conducted in solution, such that the viral proteins and antibodies 
thereto form complexes under precipitating conditions. The precipitated complexes can then 
be separated from the test sample, for example, by centrifugation. The reaction mixture can
be analyzed to determine the presence or absence of antibody-antigen complexes using any of 
a number of standard methods, such as those immunodiagnostic methods described above. 

The modified Env proteins, produced as described above, or antibodies to the 
proteins, can be provided in kits, with suitable instructions and other necessary reagents, in 
order to conduct immunoassays as described above. The kit can also contain, depending on 
the particular immunoassay used, suitable labels and other packaged reagents and materials 
(e.g., wash buffers and the like). Standard immunoassays, such as those described above, can
be conducted using these kits. 

The Env polypeptides and polynucleotides encoding the polypeptides can also be used
in vaccine compositions, individually or in combination, in, e.g., prophylactic (i.e., to prevent
infection) or therapeutic (to treat HIV following infection) vaccines. The vaccines can
comprise mixtures of one or more of the modified Env proteins (or nucleotide sequences
encoding the proteins), such as Env (e.g., gp120) proteins derived from more than one viral
isolate. The vaccine may also be administered in conjunction with other antigens and
immunoregulatory agents, for example, immunoglobulins, cytokines, lymphokines, and
chemokines, including but not limited to IL-2, modified IL-2 (cys125→ser125), GM-CSF,
IL-12, γ-interferon, IP-10, MIP1β and RANTES. The vaccines may be administered as
polypeptides or, alternatively, as naked nucleic acid vaccines (e.g., DNA), using viral vectors
(e.g., retroviral vectors, adenoviral vectors, adeno-associated viral vectors) or non-viral
vectors (e.g., liposomes, particles coated with nucleic acid or protein). The vaccines may also
comprise a mixture of protein and nucleic acid, which in turn may be delivered using the
same or different vehicles. The vaccine may be given more than once (e.g., a "prime"
administration followed by one or more "boosts") to achieve the desired effects. The same
composition can be administered as the prime and as the one or more boosts. Alternatively,
different compositions can be used for priming and boosting.

The vaccines will generally include one or more "pharmaceutically acceptable
excipients or vehicles" such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may
be present in such vehicles.

A carrier is optionally present which is a molecule that does not itself induce the
production of antibodies harmful to the individual receiving the composition. Suitable
carriers are typically large, slowly metabolized macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycollic acids, polymeric amino acids, amino acid
copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles.
Such carriers are well known to those of ordinary skill in the art. Furthermore, the Env
polypeptide may be conjugated to a bacterial toxoid, such as toxoid from diphtheria, tetanus,
cholera, etc.

Adjuvants may also be used to enhance the effectiveness of the vaccines. Such
adjuvants include, but are not limited to: (1) aluminum salts (alum), such as aluminum
hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water emulsion
formulations (with or without other specific immunostimulating agents such as muramyl
peptides (see below) or bacterial cell wall components), such as for example (a) MF59
(International Publication No. WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and
0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not
required) formulated into submicron particles using a microfluidizer such as Model 110Y
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4%
Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below) either
microfluidized into a submicron emulsion or vortexed to generate a larger particle size
emulsion, and (c) Ribi™ adjuvant system (RAS) (Ribi Immunochem, Hamilton, MT)
containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM),
and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants,
such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles
generated therefrom, such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as
interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF), tumor necrosis
factor (TNF), etc.; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly
LT-K63 (where lysine is substituted for the wild-type amino acid at position 63), LT-R72
(where arginine is substituted for the wild-type amino acid at position 72), CT-S109 (where
serine is substituted for the wild-type amino acid at position 109), and PT-K9/G129 (where
lysine is substituted for the wild-type amino acid at position 9 and glycine substituted at
position 129) (see, e.g., International Publication Nos. WO 93/13202 and WO 92/19265); and
(7) other substances that act as immunostimulating agents to enhance the effectiveness of the
composition.

Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-
hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.

Typically, the vaccine compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared. The preparation also may be emulsified or
encapsulated in liposomes for enhanced adjuvant effect, as discussed above.

The vaccines will comprise a therapeutically effective amount of the modified Env
proteins, or complexes of the proteins, or nucleotide sequences encoding the same, and any
other of the above-mentioned components, as needed. By "therapeutically effective amount"
is meant an amount of a modified Env (e.g., gp120) protein which will induce a protective
immunological response in the uninfected, infected or unexposed individual to which it is
administered. Such a response will generally result in the development in the subject of a
secretory, cellular and/or antibody-mediated immune response to the vaccine. Usually, such
a response includes but is not limited to one or more of the following effects: the production
of antibodies from any of the immunological classes, such as immunoglobulins A, D, E, G or
M; the proliferation of B and T lymphocytes; the provision of activation, growth and
differentiation signals to immunological cells; and the expansion of helper T cells, suppressor
T cells, and/or cytotoxic T cells.

Preferably, the effective amount is sufficient to bring about treatment or prevention of
disease symptoms. The exact amount necessary will vary depending on the subject being
treated; the age and general condition of the individual to be treated; the capacity of the
individual's immune system to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular Env polypeptide selected and its mode
of administration, among other factors. An appropriate effective amount can be readily
determined by one of skill in the art. A "therapeutically effective amount" will fall in a
relatively broad range that can be determined through routine trials.

Once formulated, delivery of the nucleic acid vaccines may be accomplished with or without
viral vectors, as described above, by injection using either a conventional syringe or a gene gun,
such as the Accell® gene delivery system (PowderJect Technologies, Inc., Oxford, England).
Delivery of DNA into cells of the epidermis is particularly preferred as this mode of
administration provides access to skin-associated lymphoid cells and provides for a transient
presence of DNA in the recipient. Both nucleic acids and/or peptides can be injected either
subcutaneously, epidermally, intradermally, intramucosally such as nasally, rectally and
vaginally, intraperitoneally, intravenously, orally or intramuscularly. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a single
dose schedule or a multiple dose schedule. Administration of nucleic acids may also be
combined with administration of peptides or other substances.

While the invention has been described in conjunction with the preferred specific
embodiments thereof, it is to be understood that the foregoing description as well as the
examples which follow are intended to illustrate and not limit the scope of the invention.
Other aspects, advantages and modifications within the scope of the invention will be
apparent to those skilled in the art to which the invention pertains.

Experimental

Below are examples of specific embodiments for carrying out the present invention.
The examples are offered for illustrative purposes only, and are not intended to limit the
scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of course, be
allowed for.



Example 1 

A.1. Best-Fit and Homology Searches

The crystal structure of HXB-2 gp120 was downloaded from the Brookhaven
database (COMPLEX (HIV ENVELOPE PROTEIN/CD4/FAB) 15-JUN-98 1GC1
TITLE: HIV-1 GP120 CORE COMPLEXED WITH CD4 AND A NEUTRALIZING
HUMAN ANTIBODY). Beta strands 3, 2, 21, and 20 of gp120 form a sheet near the CD4
binding site. Strands β-3 and β-2 are connected by the V1/V2 loop. Strands β-21 and β-20
are connected by another small loop. The H-bonds at the interface between strands β-2 and
β-21 are the only connection between domains of the "lower" half of the protein (joining
helix alpha 1 to the CD4 binding site). This beta sheet and these loops mask some antigens
(e.g., antigens which may generate neutralizing antibodies) that are only exposed during
CD4 binding.

Constructs were designed that remove enough of the beta sheet to expose the antigens in the
CD4 binding site, but leave enough of the protein to allow correct folding.
Specifically targeted were modifications to the small loop and, optionally, deletion of the
V1/V2 loops. Three different types of constructs were designed: (1) constructs encoding
polypeptides that leave the number of residues making up the entire 4-strand beta sheet intact,
but replace one or more residues; (2) constructs that encode polypeptides having at least one
residue of at least one beta strand excised; or (3) constructs encoding polypeptides having at
least two residues of at least one beta strand excised. Thus, a total of 6 different turns were
needed to rejoin the ends of the strands.

Initially, residues in the small loop (residues 427-430, relative to HXB-2) and
connected beta strands (β-20 and β-21) were modified to contain Gly and Pro (common in
beta turns). These sequences were then used as the target to match in each search. The
geometry of the target was matched to known proteins in the Brookhaven Protein Data Bank.
In particular, 5-residue turns (including an overlapping single residue at the N-terminal, the 2
residue target turn and 2 overlapping residues at the C-terminal) were searched in the
databases. In other words, these modified loops add a 2 residue turn that should be able to
support a geometry that will maintain the beta-sheet structure of the wild type protein. The
calculations were performed using the default parameters in the Loop Search feature of the
Biopolymer module of the Sybyl molecular modeling package. In each case, the 25 best fits
based on geometry alone were reviewed and, of those, several selected for homology and fit.

In addition, it was also determined what modifications could be made to remove most
of the V1/V2 loop (residues 124-198, relative to HXB-2) yet leave the geometry of the
protein intact. As with the small loop, constructs were also designed which excised one or
more residues from the β-2 strand (residues 119-123 of HXB-2), the β-3 strand (residues
199-201 of HXB-2) or both β-2 and β-3. For these constructs, known loops were searched to
match the geometry of a pentamer (including two remaining residues from the N-terminal
side, a 2 residue turn and 1 C-terminal residue). For these searches, Gly-Gly was preferred as
the insert along with at least one C-terminal substitution.
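
The geometric comparison behind a loop search can be illustrated with a small numerical sketch. The scoring used by the Sybyl Loop Search feature is not described in this document, so the example below substitutes a simple distance-matrix RMSD, which compares the internal distances of two equally sized point sets and is therefore unaffected by how either fragment is positioned in space (it also does not distinguish mirror images). The function names and all coordinates are hypothetical and serve only to illustrate the idea.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return every inter-point distance for a list of (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(target, candidate):
    """Distance-matrix RMSD between two point sets of equal size (>= 2 points)."""
    if len(target) != len(candidate) or len(target) < 2:
        raise ValueError("point sets must have the same length, at least 2")
    dt = pairwise_distances(target)
    dc = pairwise_distances(candidate)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(dt, dc)) / len(dt))


# Hypothetical C-alpha coordinates (angstroms) for a 5-residue target turn.
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.4, 0.0),
          (3.8, 6.8, 0.0), (0.0, 6.8, 0.0)]

# The same shape moved elsewhere: the score stays (numerically) at zero.
shifted = [(x + 10.0, y - 2.0, z + 1.0) for x, y, z in target]

# A candidate whose middle residue is displaced by 1 angstrom scores worse.
perturbed = list(target)
perturbed[2] = (6.5, 3.4, 0.0)

print(distance_rmsd(target, shifted))
print(distance_rmsd(target, perturbed))
```

A lower score indicates a closer geometric match, which is the sense in which the RMSD columns of the tables below are ranked.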

A.2. Small Loop Replacements

In one aspect, the native sequence was replaced with residues that expose the CD4
binding site, but leave the overall geometry of the protein relatively unchanged. For the
small loop replacements, the target to match was: ASN425-MET426-GLY427-GLY428-GLY431.
Results of the search are summarized in Table 1.

Table 1: Search of Small Loop (Asn425 through Gly431)

Rank       Sequence              RMSD       % Homology   Seq Id No.
Best fit   LYS-ASP-SER-ASN-ASN   0.16689    62.5         27
3          TYR-GLY-LEU-GLY-LEU   0.220308   62.5         28
4          GLU-ARG-GLU-ASP-GLY   0.241754   62.5         29
7          ARG-LYS-GLY-GLY-ASN   0.24881    100          30
12         TRP-THR-GLY-SER-TYR   0.26417    83.33        31

Based on these results, constructs encoding Gly-Gly (#7), Gly-Ser (#12) or Gly-Gly-Asn
(#7) were recommended.
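
The trade-off between geometric fit and homology in Table 1 can be made concrete with a short sketch. The selection criteria are described here only as "homology and fit", so the 80% homology cutoff and the function name `shortlist` below are assumptions made for illustration; the rows themselves are transcribed from Table 1.

```python
# Rows transcribed from Table 1: (rank, sequence, RMSD, % homology).
TABLE_1 = [
    ("Best fit", "LYS-ASP-SER-ASN-ASN", 0.16689, 62.5),
    ("3", "TYR-GLY-LEU-GLY-LEU", 0.220308, 62.5),
    ("4", "GLU-ARG-GLU-ASP-GLY", 0.241754, 62.5),
    ("7", "ARG-LYS-GLY-GLY-ASN", 0.24881, 100.0),
    ("12", "TRP-THR-GLY-SER-TYR", 0.26417, 83.33),
]


def shortlist(rows, min_homology):
    """Keep rows at or above a homology cutoff, best homology first, then best fit."""
    kept = [row for row in rows if row[3] >= min_homology]
    return sorted(kept, key=lambda row: (-row[3], row[2]))


# The 80% cutoff is a hypothetical threshold, not a value stated in the text.
for rank, sequence, rmsd, homology in shortlist(TABLE_1, min_homology=80.0):
    print(rank, sequence, rmsd, homology)
```

With this cutoff the sketch retains the rank 7 and rank 12 matches, the same two entries cited in the recommendation above, although the actual selection may have weighed other factors.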

As V1/V2 and one or more residues of β-2 and β-3 are also optionally deleted in the
modified polypeptides of the invention, known loops to match the geometry of the V1/V2
loop were also searched. For the V1/V2 loop, the target to match was:
Lys121-Leu122-Gly123-Gly124-Ser199. Some notable matches are shown in Table 2:



Table 2: Search of V1/V2 loop (Lys121 through Ser199)

Rank       Sequence              RMSD       % Homology   Seq Id. No.
Best fit   GLN-VAL-HIS-ASP-GLU   0.154764   68.75        32
2          LYS-GLU-GLY-ASP-LYS   0.15718    81.25        33
9          ARG-SER-GLY-ARG-SER   0.173731   68.75        34
11         THR-LEU-GLY-ASN-SER   0.175554   81.25        35
16         HIS-PHE-GLY-ALA-GLY   0.178772   93.75        36



Based on these searches, constructs encoding Gly-Asn in place of V1/V2 were
recommended.

A.3. One Additional Residue Excisions 

For a slightly truncated small loop, one more residue was trimmed from each beta 
strand to slightly shorten the beta sheet. The target to match was: ILE424-ASN425- 
GLY426-GLY427-LYS432. Results are shown in Table 3: 



Table 3: Search of Beta sheet shortened by One residue (Ile424 through Lys432)

Rank        Sequence              RMSD       % Homology   Seq Id No.
Best fit:   ARG-MET-ALA-PRO-VAL   0.316805   58.33        37
Best hom:   ASP-SER-ASP-GLY-PRO   0.440896   83.33        38

Although these searches showed more variation and worse fits than the previous
truncation, the Pro-Val or Pro-Leu encoding constructs were very similar. Accordingly,
Ala-Pro encoding constructs were recommended.

Sequences encoding gp120 polypeptides having V1/V2 deleted and an additional
residue from β-2 or β-3 excised were also searched. For the V1/V2 loop, the target to match was:
VAL120-LYS121-GLY122-GLY123-VAL200. Some notable matches are shown in Table 4.



Table 4: Search of V1/V2 loop (Val120 through Val200)

Rank        Sequence              RMSD       % Homology   Seq Id No
Best fit:   THR-VAL-ASP-PRO-TYR   0.400892   58.33333     39
2           SER-THR-ASN-PRO-LEU   0.402575   54.16667     40
3           THR-ARG-SER-PRO-LEU   0.403965   58.33333     41
7           ARG-MET-ALA-PRO-VAL   0.440118   58.33333     42



The construct encoding Ala-Pro (e.g., #7) was recommended.

A.4. Further Excisions

In yet another truncation, an additional residue was trimmed from the β-20 and β-21
strands to further shorten the beta sheet. The target to match was
ILE423-ILE424-GLY425-GLY426-ALA433. Notable matches are shown in Table 5.



Table 5: Search of Beta sheet shortened by Two Residues (Ile423 through Ala433)

Rank        Sequence              RMSD       % Homology   Seq Id No
Best fit:   THR-TYR-GLU-GLY-VAL   0.130107   79.16666     43
2           GLN-VAL-GLY-ASN-THR   0.138245   79.16666     44
3           THR-VAL-GLY-GLY-ILE   0.153362   100          45



A construct encoding Gly-Gly (e.g., #3), which has 100% homology, was
recommended.

Also searched were sequences encoding a deleted V1/V2 region and at least two
residues excised from β-2, β-3 or at least one residue excised from β-2 and β-3. The target to
match was: CYS119-VAL120-GLY121-GLY122-ILE201. Notable matches are shown in
Table 6.



Table 6: Search of V1/V2 loop (Cys119 through Ile201)

Rank        Sequence              RMSD       % Homology   Seq Id No
Best fit:   ASP-LEU-PRO-GLY-CYS   0.250501   75           46
4           ASP-VAL-GLY-GLY-LEU   0.290383   100          47



It was determined that both constructs would be used. 

B.1. Constructs encoding modified Env polypeptides

As described above, the native loops extruding from the 4-β antiparallel strands were
excised and replaced with 1 to 3 residue turns. The loops were either replaced so as to leave
the entire β-strands intact or excised by trimming one or more amino acids from each side of
the connected strands. The ends of the strands were rejoined with
turns that preserve the same backbone geometry (e.g., tertiary structure of β-20 and β-21), as
determined by searching the Brookhaven Protein Data Bank.
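
The excise-and-rejoin operation described above can be illustrated on a short sequence segment. The sketch below is an illustration only: the function name `replace_loop` is hypothetical, and the segment covers HXB-2 positions 422 to 435 using the residues this document names for the SF162-based constructs (including Arg at position 426). Positions 428 to 430, whose identities are not stated in the text, are written as the placeholder `X`.

```python
# One-letter residues for HXB-2 positions 422..435 (SF162-based constructs).
# 'X' marks positions 428-430, which are not identified in the text.
FIRST_POS = 422
SEGMENT = "QIINRWXXXGKAMY"


def replace_loop(segment, first_pos, left_anchor, turn, right_anchor):
    """Keep residues up to left_anchor, insert `turn`, resume at right_anchor.

    Positions are in the numbering of `segment`, whose first residue is
    `first_pos`. Residues strictly between the two anchors are removed.
    """
    if not first_pos <= left_anchor < right_anchor < first_pos + len(segment):
        raise ValueError("anchors must lie inside the segment, left before right")
    left = segment[: left_anchor - first_pos + 1]
    right = segment[right_anchor - first_pos:]
    return left + turn + right


# Trp427-Gly-Gly431: one Gly replaces positions 428-430.
print(replace_loop(SEGMENT, FIRST_POS, 427, "G", 431))   # QIINRWGGKAMY
# Arg426-Gly-Gly-Gly431: two Gly residues replace positions 427-430.
print(replace_loop(SEGMENT, FIRST_POS, 426, "GG", 431))  # QIINRGGGKAMY
# Asn425-Ala-Pro-Lys432: Ala-Pro replaces positions 426-431.
print(replace_loop(SEGMENT, FIRST_POS, 425, "AP", 432))  # QIINAPKAMY
```

The same function applies to the V1/V2 excisions, given a segment that spans the relevant anchors.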

Table 7A is a summary of the truncations of the variable regions 1 and 2
recommended for this study, as determined in Example 1.A. above.

Table 7A

V1/V2 Modifications            SEQ ID NO   Figure
-LEU122-GLY-ASN-SER199         7           10
-LYS121-ALA-PRO-VAL200-        6           9
-VAL120-GLY-GLY-ILE201-        4           7
-VAL120-PRO-GLY-ILE201B-       5           8
-VAL120-GLY-ALA-GLY-ALA204-    3           6
-VAL120-GLY-GLY-ALA-THR202-    8           11
-VAL127-GLY-ALA-GLY-ASN195-    25          28



As previously noted, the polypeptides encoded by the constructs of the present
invention are numbered relative to HXB-2, but the particular amino acid residue of the
polypeptides encoded by these exemplary constructs is based on SF162. Thus, for example,
although amino acid residue 195 in HXB-2 is a serine (S), constructs encoding polypeptides
having the wild type SF162 sequence will have an asparagine (N) at this position. Table 7B
shows just three of the variations in amino acid sequence between strains HXB-2 and SF162.
The entire sequences, including differences in residue and amino acid number, of HXB-2 and
SF162 are shown in the alignment of Figure 2 (SEQ ID NOs: 1 and 2).



Table 7B

HXB-2 amino acid number   HXB-2 Residue   SF162 Residue/amino acid number
128                       Serine (S)      Thr (T)/114
195                       Serine (S)      Asn (N)/188
426                       Met (M)         Arg (R)/411
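
The relationship between HXB-2 numbering and SF162 numbering in Table 7B can be expressed as a simple lookup. The sketch below uses only the three rows of Table 7B; the dictionary and function names are hypothetical. It also prints the difference between the two position numbers, which is not the same for all three rows, illustrating why a full alignment such as that of Figure 2, rather than a fixed offset, is needed to convert between the two numbering schemes.

```python
# HXB-2 position -> (HXB-2 residue, SF162 residue, SF162 position), from Table 7B.
TABLE_7B = {
    128: ("S", "T", 114),
    195: ("S", "N", 188),
    426: ("M", "R", 411),
}


def describe(hxb2_pos):
    """Summarize one Table 7B row, including the numbering offset."""
    hxb2_res, sf162_res, sf162_pos = TABLE_7B[hxb2_pos]
    offset = hxb2_pos - sf162_pos
    return (f"HXB-2 {hxb2_res}{hxb2_pos} = "
            f"SF162 {sf162_res}{sf162_pos} (offset {offset})")


for pos in sorted(TABLE_7B):
    print(describe(pos))
```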



Constructs containing deletions in the β-20 strand, β-21 strand and small loop were
also constructed. Shown in Table 8 are constructs encoding truncations in these regions. The
constructs in Table 8 are numbered relative to HXB-2 but the unmodified amino acid
sequence is based on SF162. Thus, the construct encodes an arginine (Arg) as is found in
SF162 in the amino acid position numbered 426 relative to HXB-2 (See, also, Table 7B).
Changes from wildtype (SF162) are shown in bold in Table 8.



Table 8

Small Loop/β-20 and β-21 (Modified)   SEQ ID NO   Figure
-TRP427-GLY-GLY431-                   9           12
-ARG426-GLY-GLY-GLY431-               10          13
-ARG426-GLY-SER-GLY431-               11          14
-ARG426-GLY-GLY-ASN-LYS432-           12          15
-ASN425-ALA-PRO-LYS432-               13          16
-ILE424-GLY-GLY-ALA433-               14          17
-ILE423-GLY-GLY-MET434-               15          18
-GLN422-GLY-GLY-TYR435-               16          19
-GLN422-ALA-PRO-TYR435B-              17          20



The deletion constructs shown in Tables 7 and 8 for each one of the β-strands and
combinations of them are constructed. These deletions will be tested in the Env forms gp120,
gp140 and gp160 from different HIV strains, such as subtype B strains (e.g., SF162, US4, SF2),
subtype E strains (e.g., CM235) and subtype C strains (e.g., AF110968 or AF110975).
Exemplary constructs for SF162 are shown in the Figures and are summarized in Table 9. As
noted above in Figure 2 and Table 7B, in the bridging sheet region, the amino acid sequence
of SF162 differs from HXB-2 in that the Met426 of HXB-2 is an Arg in SF162. In Table 9,
V1/V2 refers to deletions in the V1/V2 region; bsm refers to a modification in the bridging
sheet small loop.

Table 9

Construct                      Seq. Id.   Fig.   Modification/Amino acid sequence
Val120-Ala204                  3          6      V1/V2: Val120-Gly-Ala-Gly-Ala204
Val120-Ile201                  4          7      V1/V2: Val120-Gly-Gly-Ile201
Val120-Ile201B                 5          8      V1/V2: Val120-Pro-Gly-Ile201
Lys121-Val200                  6          9      V1/V2: Lys121-Ala-Pro-Val200
Leu122-Ser199                  7          10     V1/V2: Leu122-Gly-Asn-Ser199
Val120-Thr202                  8          11     V1/V2: Val120-Gly-Gly-Ala-Thr202
Trp427-Gly431                  9          12     bsm: Trp427-Gly-Gly431
Arg426-Gly431                  10         13     bsm: Arg426-Gly-Gly-Gly431
Arg426-Gly431B                 11         14     bsm: Arg426-Gly-Ser-Gly431
Arg426-Lys432                  12         15     bsm: Arg426-Gly-Gly-Asn-Lys432
Asn425-Lys432                  13         16     bsm: Asn425-Ala-Pro-Lys432
Ile424-Ala433                  14         17     bsm: Ile424-Gly-Gly-Ala433
Ile423-Met434                  15         18     bsm: Ile423-Gly-Gly-Met434
Gln422-Tyr435                  16         19     bsm: Gln422-Gly-Gly-Tyr435
Val127-Asn195                  25         28     bsm: Val127-Gly-Ala-Gly-Asn195
Gln422-Tyr435B                 17         20     bsm: Gln422-Ala-Pro-Tyr435
Leu122-Ser199; Arg426-Gly431   18         21     V1/V2/bsm: Leu122-Gly-Asn-Ser199; Arg426-Gly-Gly-Gly431
Leu122-Ser199; Arg426-Lys432   19         22     V1/V2/bsm: Leu122-Gly-Asn-Ser199; Arg426-Gly-Gly-Asn-Lys432
Leu122-Ser199-Trp427-Gly431    20         23     V1/V2/bsm: Leu122-Gly-Asn-Ser199; Trp427-Gly-Gly431
Lys121-Val200-Asn425-Lys432    21         24     V1/V2/bsm: Lys121-Ala-Pro-Val200; Asn425-Ala-Pro-Lys432
Val120-Ile201-Ile424-Ala433    22         25     V1/V2/bsm: Val120-Gly-Gly-Ile201; Ile424-Gly-Gly-Ala433
Val120-Ile201B-Ile424-Ala433   23         26     V1/V2/bsm: Val120-Pro-Gly-Ile201; Ile424-Gly-Gly-Ala433
Val120-Thr202; Ile424-Ala433   24         27     V1/V2/bsm: Val120-Gly-Gly-Ala-Thr202; Ile424-Gly-Gly-Ala433
Val127-Asn195; Arg426-Gly431   25         29     V1/V2/bsm: Val127-Gly-Ala-Gly-Asn195; Arg426-Gly-Gly-Gly431
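
The modification notation used in Table 9 (a left anchor residue, the inserted turn, and a right anchor residue) can be read mechanically. The sketch below is a hypothetical parser written only to illustrate the notation; the function name and the returned field names are assumptions. The count it reports is the number of HXB-2 positions lying between the two anchors, which need not equal the number of residues removed from SF162, since the two strains differ in length in this region.

```python
import re

# An anchor token such as "Val120": three-letter code plus an HXB-2 position.
ANCHOR = re.compile(r"^([A-Z][a-z]{2})(\d+)$")


def parse_modification(text):
    """Split e.g. 'Val120-Gly-Gly-Ile201' into anchors and inserted turn."""
    parts = text.split("-")
    left, right = ANCHOR.match(parts[0]), ANCHOR.match(parts[-1])
    if len(parts) < 2 or left is None or right is None:
        raise ValueError(f"unrecognized modification: {text!r}")
    left_pos, right_pos = int(left.group(2)), int(right.group(2))
    return {
        "left_anchor": (left.group(1), left_pos),
        "turn": parts[1:-1],
        "right_anchor": (right.group(1), right_pos),
        # HXB-2 positions strictly between the two anchors.
        "positions_spanned": right_pos - left_pos - 1,
    }


print(parse_modification("Val120-Gly-Gly-Ile201"))
print(parse_modification("Trp427-Gly-Gly431"))
```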



Combinations of V1/V2 deletions and bridging sheet small loop modifications in
addition to those specifically shown in Table 9 are also within the scope of the present
invention. Various forms of the different embodiments of the invention, described herein,
may be combined.

The first screening will be done after transient expression in COS-7, RD and/or 293
cells. The proteins that are expressed will be analyzed by immunoblot, ELISA, and for
binding to mAbs directed to the CD4 binding site and other important epitopes on gp120 to
determine integrity of structure. They will also be tested in a CD4 binding assay and, in
addition, for the binding of neutralizing antibodies, for example using patient sera or mAb 448D
(directed to Glu370 and Tyr384, a region of the CD4 binding groove that is not altered by the
deletions).

The immunogenicity of these novel Env glycoproteins will be tested in rodents and
primates. The structures will be administered as DNA vaccines or adjuvanted protein
vaccines or in combined modalities. The goal of these vaccinations will be to achieve broadly
reactive neutralizing antibody responses.

What is claimed is:

1. A polynucleotide encoding a modified HIV Env polypeptide wherein the
polypeptide has at least one amino acid deleted or replaced in the region corresponding to
residues 420 to 436 relative to HXB-2 (SEQ ID NO:1).

2. The polynucleotide of claim 1, wherein the region corresponding to residues 124-198
relative to HXB-2 is deleted and at least one amino acid is deleted or replaced in the
regions corresponding to the residues 119 to 123 and 199 to 210 relative to HXB-2 (SEQ ID
NO:1).

3. The polynucleotide of claim 1, wherein at least one amino acid in the region
corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO:1) is deleted or
replaced.

4. The polynucleotide of claim 2, wherein at least one amino acid in the region
corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO:1) is deleted or
replaced.

5. The polynucleotide of claim 1, wherein the amino acid sequence of the modified
HIV Env polypeptide is based on strain SF162.

6. An immunogenic modified HIV Env polypeptide having at least one amino acid 
deleted or replaced in the region corresponding to residues 420 through 436, relative to HXB- 
2 (SEQ ID NO: 1). 

7. The polypeptide of claim 6, wherein one amino acid is deleted in the region 
corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:l). 
8. The polypeptide of claim 6, wherein more than one amino acid is deleted in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 

9. The polypeptide of claim 6, wherein at least one amino acid is replaced in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 

10. The polypeptide of claim 6, wherein at least one amino acid residue between 
about amino acid residue 427 and amino acid residue 429 relative to HXB-2 (SEQ ID NO:1) 
is deleted or replaced. 

11. The polypeptide of claim 6, wherein the V1 and V2 regions of the polypeptide are 
truncated. 

12. The polypeptide of claim 10, wherein the V1 and V2 regions of the polypeptide 
are truncated. 

13. The polypeptide of claim 6, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

14. A construct comprising the nucleotide sequence depicted in Figure 6 (SEQ ID NO:3). 

15. A construct comprising the nucleotide sequence depicted in Figure 7 (SEQ ID NO:4). 

16. A construct comprising the nucleotide sequence depicted in Figure 8 (SEQ ID NO:5). 

17. A construct comprising the nucleotide sequence depicted in Figure 9 (SEQ ID NO:6). 

18. A construct comprising the nucleotide sequence depicted in Figure 10 (SEQ ID NO:7). 

19. A construct comprising the nucleotide sequence depicted in Figure 11 (SEQ ID NO:8). 

20. A construct comprising the nucleotide sequence depicted in Figure 12 (SEQ ID NO:9). 

21. A construct comprising the nucleotide sequence depicted in Figure 13 (SEQ ID NO:10). 

22. A construct comprising the nucleotide sequence depicted in Figure 14 (SEQ ID NO:11). 

23. A construct comprising the nucleotide sequence depicted in Figure 15 (SEQ ID NO:12). 

24. A construct comprising the nucleotide sequence depicted in Figure 16 (SEQ ID NO:13). 

25. A construct comprising the nucleotide sequence depicted in Figure 17 (SEQ ID NO:14). 

26. A construct comprising the nucleotide sequence depicted in Figure 18 (SEQ ID NO:15). 

27. A construct comprising the nucleotide sequence depicted in Figure 19 (SEQ ID NO:16). 

28. A construct comprising the nucleotide sequence depicted in Figure 20 (SEQ ID NO:17). 

29. A construct comprising the nucleotide sequence depicted in Figure 21 (SEQ ID NO:18). 

30. A construct comprising the nucleotide sequence depicted in Figure 22 (SEQ ID NO:19). 

31. A construct comprising the nucleotide sequence depicted in Figure 23 (SEQ ID NO:20). 

32. A construct comprising the nucleotide sequence depicted in Figure 24 (SEQ ID NO:21). 

33. A construct comprising the nucleotide sequence depicted in Figure 25 (SEQ ID NO:22). 

34. A construct comprising the nucleotide sequence depicted in Figure 26 (SEQ ID NO:23). 

35. A construct comprising the nucleotide sequence depicted in Figure 27 (SEQ ID NO:24). 

36. A construct comprising the nucleotide sequence depicted in Figure 28 (SEQ ID NO:25). 

37. A construct comprising the nucleotide sequence depicted in Figure 29 (SEQ ID NO:26). 

38. A vaccine composition comprising a polynucleotide encoding a modified Env 
polypeptide according to any one of claims 1-5. 

39. A vaccine composition comprising a polynucleotide construct encoding a 
modified Env polypeptide according to any of claims 14-37. 

40. A vaccine composition comprising a modified Env polypeptide according to any 
of claims 6-13. 

41. The vaccine composition of any of claims 38-40, further comprising an adjuvant. 

42. A method of inducing an immune response in a subject comprising administering a 
polynucleotide according to any one of claims 1-5 in an amount sufficient to induce an 
immune response in the subject. 

43. A method of inducing an immune response in a subject comprising administering a 
polynucleotide construct according to any one of claims 14-37 in an amount sufficient to 
induce an immune response in the subject. 

44. A method of inducing an immune response in a subject comprising administering 
a composition comprising a modified Env polypeptide according to any one of claims 6-13, 
wherein the composition is administered in an amount sufficient to induce an immune 
response in the subject. 

45. The method of any of claims 42-44 further comprising administering an adjuvant 
to the subject. 

46. A method of inducing an immune response in a subject comprising: 

(a) administering a first composition comprising a polynucleotide according to any of 
claims 1-5 in a priming step and 

(b) administering a second composition comprising a modified Env polypeptide 
according to any of claims 6-13, as a booster, in an amount sufficient to induce an immune 
response in the subject. 

47. The method of claim 46 wherein the first composition or second composition 
further comprises an adjuvant. 

48. The method of claim 46 wherein the first and second compositions further 
comprise an adjuvant. 
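
The claims above locate each modification by residues "corresponding to" positions numbered relative to HXB-2. Purely as an illustration, the following Python sketch shows one way such a correspondence could be read off a pairwise alignment. The function name `map_reference_positions` and the short toy sequences are hypothetical and are not taken from the sequences disclosed herein; query residues that fall opposite a gap in the reference are simply skipped in this sketch.

```python
def map_reference_positions(aligned_ref, aligned_query, start, end):
    """Map reference residues start..end (1-based, inclusive) onto a query.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Returns (query_positions, query_segment): the 1-based query positions
    aligned to the requested reference residues, and the residues found there.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")

    ref_pos = 0      # residues of the reference consumed so far
    query_pos = 0    # residues of the query consumed so far
    positions = []
    segment = []
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        # Keep only columns holding a reference residue inside the window
        # and an actual query residue (not a gap).
        in_window = ref_char != "-" and start <= ref_pos <= end
        if in_window and query_char != "-":
            positions.append(query_pos)
            segment.append(query_char)
    return positions, "".join(segment)


# Hypothetical toy alignment (not HIV sequences).
ALIGNED_REF = "MKV-LTAGQW"
ALIGNED_QUERY = "MRVALT-GQW"
positions, segment = map_reference_positions(ALIGNED_REF, ALIGNED_QUERY, 3, 6)
print(positions, segment)
```

In this toy case reference residue 6 has no counterpart in the query, so only three query positions are reported for a four-residue reference window.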
[Figure: alignment of the Env amino acid sequences of HXB2, 162, SF2, CM236 and US4 with a consensus sequence, in blocks of 50 alignment positions. Only the consensus rows are legible; they are reproduced below with the starting position of each block.] 

Consensus (1) MRVK YQHLWRWG TLLLGMLMIC SATEKLWVTVYYGVPVWK 
Consensus (51) EATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVL NVTENFNMW 
Consensus (101) KNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTDL 
Consensus (151) NATNTNSS KE M KGEI KNCSFNI TTS I RDKVQKEYALFY 
Consensus (201) KLDVVPIDND TS YRLINCNTSVITQACPKVSFEPIPIHYCAPAG 
Consensus (251) FAILKCNDK FNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVI 
Consensus (301) RSENFTDNAKTIIVQLNESVEINCTRPNNNTRKSI I GPGRAFY TGD 
Consensus (351) IIGDIRQAHCNISRAKWNNTL QIV KLREQFGNNKTIIFNQSSGGDPEI 
Consensus (401) VMHSFNCGGEFFYCNTTQLFNSTW N TEG N T G DTIILPCRIK 
Consensus (451) QIINMWQEVGKAMYAPPI GQIRCSSNITGLLLTRDGG NITNDTEIF 
Consensus (501) RPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGI GA 
Consensus (551) MFLGFLGAAGSTMGAASLTLTVQARQLLSGIVQQQNNLLRAIEAQQHLLQ 
Consensus (601) LTVWGIKQLQARVLAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNK 
Consensus (651) S LEE I WNNMTWMEWE RE I NYTNLIYTLIEESQNQQEKNEQELLELDKWA 
Consensus (701) SLWNWFDITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSF 
Consensus (751) QTRLP PRGPDRPEGIEEEGGERDRDRSVRLV G LALIWDDLRSLCLFS 
Consensus (801) YHRLRDLLLIAARI VELLGR RGWEALKYWWNLLQYW QELKNS 
Consensus (851) AVSLLNATAI AVAEGTDRVI EVAQRAFRAI LH I PRRIRQGLER LL 
[Figure: nucleotide sequence alignment of the modified Env constructs. Only isolated fragments and the final block of the alignment are legible; the per-construct position columns are omitted.] 

PCT/US99/31272 

Legible fragments: 

CAGCCTGTGGGACCAGA 
AAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCA 
ACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAA 

Final alignment block (alignment positions 2221 to 2352): 

2221-2250 GACGCCATCGCCATCGCCGTGGCCGA 
2251-2280 ACCGACCGCATCATCGAGGTGGCCCAGCGC 
2281-2310 ATCGGCCGCGCCTTCCTGCACATCCCCCGC 
2311-2340 CGCATCCGCCAGGGCTTCGAGCGCGCCCTG 
2341-2352 CTGTAACTCGAG 
SEQ ID NO:3 VAL120-ALA204 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGGCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCGCCTGCCCCAA 

GGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTG 

CAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCC 

ACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGC 

GTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGA 

GAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCC 

CCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACA 

TCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTC 

GGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAG 

CTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAA 

CAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGA 

TCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATC 

CGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAA 

CACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCC 

GCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGT 

GCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCG 

GCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCT 

ACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCA 

TCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTG 

GCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTG 

ATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTAC 

TGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGA 

CGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCG 

GCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAAC 

TCGAG 
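
As an illustrative aid for reading the nucleotide listings, the sketch below translates a coding sequence from its first ATG using the standard genetic code. It is applied to the first 63 nucleotides of SEQ ID NO:3 as listed above. The function name `translate_from_first_atg` is hypothetical, and the sketch assumes the input contains only the letters A, C, G and T. It may also be noted that the listed constructs begin with GAATTC and end with CTCGAG, which match the EcoRI and XhoI recognition sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the input.

    A trailing partial codon is ignored. Returns one-letter amino acid codes.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line of SEQ ID NO:3 as listed (63 nucleotides).
SEQ3_FIRST_LINE = (
    "GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA"
)
leader = translate_from_first_atg(SEQ3_FIRST_LINE)
print(leader)  # MDAMKRGLCCVLLLCGA
```

The translated leader agrees with the start of the row labelled 162 in the amino acid alignment (MDAM...), which is some reassurance that the reading frame has been chosen correctly.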
SEQ ID NO:4 VAL120-ILE201 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACTACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 
SEQ ID NO:5 VAL120-ILE201B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCAGTCTTCG 

TTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCA 
CCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGCCACCC 
ACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACA 

TGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGC 

CCTGCGTGCCCGGCATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGC 

CCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGT 

GAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCT 

GGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCC 

CGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGC 

GAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTC 

TTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATG 

TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGC 

GCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGC 

GCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGT 

GCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGG 

CATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCAT 

CTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCT 

GATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 

ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 

GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCG 

AGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCT 

GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG 

CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTG 

GATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCAC 

CGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAG 

GGCTTCGAGCGCGCCCTGCTGTAACTCGAGCGTGCT 
SEQ ID NO:6 LYS121-VAL200 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCTC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCGCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 
TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 
CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 
AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 
CTGCTGTAACTCGAGCGTGCT 
SEQ ID NO:7 LEU122-SER199 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 
GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 
TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCC 

CCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 

GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAA 

CTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCA 

CCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTC 

CTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 

CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGG 

CCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTG 

GAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACA 

CCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGA 

CAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTT 

CATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAA 

CCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCC 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCC 

CTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTAC 

CACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGC 

TGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAG 

CGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGA 

GGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGA 

GCGCGCCCTGCTGTAACTCGAGCGTGCT 
SEQ ID NO:8 VAL120-THR202 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 
SEQ ID NO:9 TRP427-GLY431 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaa 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

aagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagac 

catcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcg 

gcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcg 

gccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgct 

ggggcggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatc 

accggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccg 

ccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtga 

agatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaag 

cgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggc 

gcccgcagcctgaccctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagca 

gaacaacctgctgcgcgccatcgaggcccagcagcacctgctgcagctgaccgtgtggggca 

tcaagcagctgcaggcccgcgtgctggccgtggagcgctacctgaaggaccagcagctgctg 

ggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccctggaacgccagctg 

gagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgggagcgcgag 

atcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcaggagaa 

gaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatca 

gcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcggcctggtgggcctgcgca 

tcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagcttcc 

agacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggagggcggc 

gagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggacga 

cctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccg 

catcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgctgc 

agtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgcc 

gtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgca 

catcccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 



FIG. 12 



WO 00/39303 






SEQ ID NO:10 ARG426-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 13 






SEQ ID NO:11 ARG426-GLY431B

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCAGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:12 ARG426-LYS432

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTCGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCAACAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:13 ASN425-LYS432

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGT 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCC^ 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACGCCC 

CCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCC 

TGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGC 

GGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGA 

GCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCG 

TGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCA 

GCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAAC 

CTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCA 

GCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCT 

GGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAAC 

AAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAA 

CTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGC 

AGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGG 

CTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTC 

ACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGC 

TTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGA 

CCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAG 

CCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGA 

GCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGA 

TCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAG 

GGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGC 

CGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:14 ILE424-ALA433

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTCCTGTGTGCTGCTG 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTCTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCGGCGGC 

GCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTG 

CTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGG 

CGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACC 

CTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTG 

ACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCT 

GCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGC 

AGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGC 

TGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACA 

CCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGA 

GCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGT

GGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCG 

TGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCC 

CCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGC 

GACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTG 

TGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTG 

CTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCA 

GGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCA 

CCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCA 

TCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:15 ILE423-MET434

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGT

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCGGCGGCATG 

TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACC 

CGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACAT 

GCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCG 

TGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGC 

GCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTG 

ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGC 

CATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCC 

GCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 

GGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGA 

CCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACC 

TGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 

GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACAT 

CAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAG 

CATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCC 

CCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGC 

AGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTG 

TTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGC 

CGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCT 

GAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACC 

GCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCC 

AGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



SEQ ID NO:16 GLN422-TYR435 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGGCGGCTACGCC 

CCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGAC 

GGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCC 

CCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATG 

TTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTG 

CAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 

GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 

TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAG 

CTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGAT 

CTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCT 

ACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCT 

GGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGA 

TCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCG 

TGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCG 

GCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAG 

CCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAG 

CTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCG 

CGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGA 

ACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATC 

ATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGC 

TTCGAGCGCGCCCTGCTGTAACTCGAG 



SEQ ID NO:17 GLN422-TYR435B 

GAATTCGCCACCATGGATGCAATGA^ 
GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTC^ 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACXXXA 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGCCCCCTACGCCC 

CCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 

AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCC 

CACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGT 

TCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGC 

AGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 

GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCT 

GGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATC 

TGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTA 

CACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 

GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGAT 

CTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGT 

GAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGG 

CCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGC 

CCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGC 

TACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGC 

GGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAA 

CAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCAT 

CGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTT 

CGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 20 




SEQ ID NO:18: LEU122-SER199; ARG426-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 21 




SEQ ID NO:19 LEU122-SER199; ARG426-LYS432 



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgggcaacagcgtgat 

cacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccgg

cttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtga 

gcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggc 

agcctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccat

catcgtgcagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgca 

agagcatcaccatcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatcc 

gccaggcccactgcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgacc 

aagctgcaggcccagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccc 

cgagatcgtgatgcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagct

gttcaacagcacctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgc 

cctgccgcatcaagcagatcatcaaccgcggcggcaacaaggccatgtacgccccccccatcc 

gcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaag

gagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcg 

cagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaagg 

ccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggc 

ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgc 

cagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagca 

gcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtgg 

agcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaa 

catgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctga

tcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtg 

ggccagcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcat 

gatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgt 

gcgccagggctacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccg 

ccccgagggcatcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgc 

acggcctgctggccctgatctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcc 

tgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggagg 

ccctgaagtactggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtg 

agcctgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcc 

cagcgcatcggccgcgccttcctgcacatcccccgccgcatccgccagggcttcgagcgcgcc 

ctgctgtaactcgag 



FIG. 22 




SEQ ID NO: 20: LEU122-SER199; TRP427-GLY431 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCC

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 23 




SEQ ID NO:21 LYS121-VAL200; ASN425-LYS432 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACGCCCCCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCG 

CTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACA 

CCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTAC 

AAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGT 

GGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCG 

GCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAG 

CTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAA 

GGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGC 

CCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATG 

GAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCA 

GAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGC 

CTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTAC 

AGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATC 

GAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGC 

CCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGAT 

CCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTG 

GGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 

CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 

CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTC 

GAG 



FIG. 24 




SEQ ID NO:22 VAL120-ILE201; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC^

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:23: VAL120-ILE201B; ILE424-ALA433 



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgcccggcatcacccaggcctgc 

cccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctg

aagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtg 

cacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgagg 

agggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctg

aaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccat 

cggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactg 

caacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggccc 

agttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatg 

cacagcttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacc 

tggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaa 

gcagatcatcggcggcgccatgtacgccccccccatccgcggccagatccgctgcagcagca 

acatcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatc 

TTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGT 

GGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCG 

AGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCA 

TGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAG 

CAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTG 

GGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGC 

TGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCA 

GCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGC 

GAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGA 

GAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACA 

TCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGC 

GCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCT 

TCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 

GGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGA 

CGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGC 

CCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGC 

TGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:24 VAL120-THR202; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
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SEQ ID NO:25 VAL127-ASN195 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg

ggggcagggaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcgagcc

catccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaagtt 

caacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgccccg 

tggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgcagc 

gagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatcaa 

ctgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttcta

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

ggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatc 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

cgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccc 

caacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggc 

aggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatctt 

ccgccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtgg 

tgaagatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgag 

aagcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatg 

ggcgcccgcagcctgaccctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagca 

gcagaacaacctgctgcgcgccatcgaggcccagcagcacctgctgcagctgaccgtgtggg 

gcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctgaaggaccagcagctg 

ctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccctggaacgccag 

ctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgggagcgcg 

agatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcaggag 

aagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacat 

cagcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcggcctggtgggcctgcg 

catcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagctt 

ccagacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggagggcg 

gcgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggac 

gacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcc 

cgcatcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgct 

gcagtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcg 

ccgtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgc 

acatcccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 
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SEQ ID NO:26 VAL127-ASN195; ARG426-GLY431 



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

ggggcagggaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcgagcc 

catccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaagtt 

caacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgccccg 

tggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgcagc 

gagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatcaa 

ctgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttcta

cgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagt 

ggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatc 

gtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcgg 

cgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccc

caacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgcggcg 

gcggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcacc 

ggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccc 

cgggggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaag 

atcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcg 

CGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGC 

CCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGA 

ACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATC 

AAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGG 

CATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGA 

GCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATC 

GACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAA 

CGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCA 

AGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCG 

TGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGA 

CCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAG 

cgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggacgacctg 

cgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccgcatc 

gtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgctgcagta 

ctggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgccgtgg 

ccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgcacatcc 

cccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag
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<110> Chiron Corporation 

<120> MODIFIED HIV ENV POLYPEPTIDES 

<130> 1605.100 

<140> 
<141> 

<160> 26 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 856 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 1 

Met Arg Val Lys Glu Lys Tyr Gln His Leu Trp Arg Trp Gly Trp Arg
1 5 10 15

Trp Gly Thr Met Leu Leu Gly Met Leu Met Ile Cys Ser Ala Thr Glu
20 25 30

Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala
35 40 45

Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu
50 55 60

Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn
65 70 75 80

Pro Gln Glu Val Val Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp
85 90 95

Lys Asn Asp Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp
100 105 110

Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Ser
115 120 125

Leu Lys Cys Thr Asp Leu Lys Asn Asp Thr Asn Thr Asn Ser Ser Ser
130 135 140

Gly Arg Met Ile Met Glu Lys Gly Glu Ile Lys Asn Cys Ser Phe Asn
145 150 155 160

Ile Ser Thr Ser Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Phe Phe
165 170 175

Tyr Lys Leu Asp Ile Ile Pro Ile Asp Asn Asp Thr Thr Ser Tyr Lys
180 185 190

Leu Thr Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val
195 200 205

Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala
210 215 220

Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr
225 230 235 240

Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser
245 250 255

Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val Ile
260 265 270

Arg Ser Val Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu
275 280 285

Asn Thr Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg
290 295 300

Lys Arg Ile Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Thr Ile
305 310 315 320

Gly Lys Ile Gly Asn Met Arg Gln Ala His Cys Asn Ile Ser Arg Ala
325 330 335

Lys Trp Asn Asn Thr Leu Lys Gln Ile Ala Ser Lys Leu Arg Glu Gln
340 345 350

Phe Gly Asn Asn Lys Thr Ile Ile Phe Lys Gln Ser Ser Gly Gly Asp
355 360 365

Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr
370 375 380

Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Phe Asn Ser Thr Trp
385 390 395 400

Ser Thr Glu Gly Ser Asn Asn Thr Glu Gly Ser Asp Thr Ile Thr Leu
405 410 415

Pro Cys Arg Ile Lys Gln Ile Ile Asn Met Trp Gln Lys Val Gly Lys
420 425 430

Ala Met Tyr Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn
435 440 445

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Asn Ser Asn Asn Glu
450 455 460

Ser Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg
465 470 475 480

Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val
485 490 495

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala
500 505 510

Val Gly Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser
515 520 525

Thr Met Gly Ala Ala Ser Met Thr Leu Thr Val Gln Ala Arg Gln Leu
530 535 540

Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu
545 550 555 560

Ala Gln Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu
565 570 575

Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Leu
580 585 590

Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val
595 600 605

Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Glu Gln Ile Trp Asn
610 615 620

His Thr Thr Trp Met Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser
625 630 635 640

Leu Ile His Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn
645 650 655

Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp
660 665 670

Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Leu Phe Ile Met Ile
675 680 685

Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser Ile
690 695 700

Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His
705 710 715 720

Leu Pro Thr Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu Glu
725 730 735

Gly Gly Glu Arg Asp Arg Asp Arg Ser Ile Arg Leu Val Asn Gly Ser
740 745 750

Leu Ala Leu Ile Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe Ser Tyr
755 760 765

His Arg Leu Arg Asp Leu Leu Leu Ile Val Thr Arg Ile Val Glu Leu
770 775 780

Leu Gly Arg Arg Gly Trp Glu Ala Leu Lys Tyr Trp Trp Asn Leu Leu
785 790 795 800

Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn
805 810 815

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val
820 825 830

Val Gln Gly Ala Cys Arg Ala Ile Arg His Ile Pro Arg Arg Ile Arg
835 840 845

Gln Gly Leu Glu Arg Ile Leu Leu
850 855



<210> 2 
<211> 847 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 2 

Met Arg Val Lys Gly Ile Arg Lys Asn Tyr Gln His Leu Trp Arg Gly
1 5 10 15

Gly Thr Leu Leu Leu Gly Met Leu Met Ile Cys Ser Ala Val Glu Lys
20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr
35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val
50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro
65 70 75 80

Gln Glu Ile Val Leu Glu Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125

His Cys Thr Asn Leu Lys Asn Ala Thr Asn Thr Lys Ser Ser Asn Trp
130 135 140

Lys Glu Met Asp Arg Gly Glu Ile Lys Asn Cys Ser Phe Lys Val Thr
145 150 155 160

Thr Ser Ile Arg Asn Lys Met Gln Lys Glu Tyr Ala Leu Phe Tyr Lys
165 170 175

Leu Asp Val Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr Lys Leu Ile
180 185 190

Asn Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val Ser Phe
195 200 205

Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala Ile Leu
210 215 220

Lys Cys Asn Asp Lys Lys Phe Asn Gly Ser Gly Pro Cys Thr Asn Val
225 230 235 240

Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser Thr Gln
245 250 255

Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Gly Val Val Ile Arg Ser
260 265 270

Glu Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu Lys Glu
275 280 285

Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser
290 295 300

Ile Thr Ile Gly Pro Gly Arg Ala Phe Tyr Ala Thr Gly Asp Ile Ile
305 310 315 320

Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Gly Glu Lys Trp Asn
325 330 335

Asn Thr Leu Lys Gln Ile Val Thr Lys Leu Gln Ala Gln Phe Gly Asn
340 345 350

Lys Thr Ile Val Phe Lys Gln Ser Ser Gly Gly Asp Pro Glu Ile Val
355 360 365

Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys Asn Ser Thr
370 375 380

Gln Leu Phe Asn Ser Thr Trp Asn Asn Thr Ile Gly Pro Asn Asn Thr
385 390 395 400

Asn Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asn Arg
405 410 415

Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile Arg Gly Gln
420 425 430

Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly
435 440 445

Gly Lys Glu Ile Ser Asn Thr Thr Glu Ile Phe Arg Pro Gly Gly Gly
450 455 460

Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val
465 470 475 480

Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val
485 490 495

Val Gln Arg Glu Lys Arg Ala Val Thr Leu Gly Ala Met Phe Leu Gly
500 505 510

Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Arg Ser Leu Thr Leu
515 520 525

Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln Asn
530 535 540

Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu Leu Gln Leu Thr
545 550 555 560

Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu Ala Val Glu Arg
565 570 575

Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys
580 585 590

Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser Trp Ser Asn Lys
595 600 605

Ser Leu Asp Gln Ile Trp Asn Asn Met Thr Trp Met Glu Trp Glu Arg
610 615 620

Glu Ile Asp Asn Tyr Thr Asn Leu Ile Tyr Thr Leu Ile Glu Glu Ser
625 630 635 640

Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys
645 650 655

Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Ser Lys Trp Leu Trp Tyr
660 665 670

Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile
675 680 685

Val Phe Thr Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser
690 695 700

Pro Leu Ser Phe Gln Thr Arg Phe Pro Ala Pro Arg Gly Pro Asp Arg
705 710 715 720

Pro Glu Gly Ile Glu Glu Glu Gly Gly Glu Arg Asp Arg Asp Arg Ser
725 730 735

Ser Pro Leu Val His Gly Leu Leu Ala Leu Ile Trp Asp Asp Leu Arg
740 745 750

Ser Leu Cys Leu Phe Ser Tyr His Arg Leu Arg Asp Leu Ile Leu Ile
755 760 765

Ala Ala Arg Ile Val Glu Leu Leu Gly Arg Arg Gly Trp Glu Ala Leu
770 775 780

Lys Tyr Trp Gly Asn Leu Leu Gln Tyr Trp Ile Gln Glu Leu Lys Asn
785 790 795 800

Ser Ala Val Ser Leu Phe Asp Ala Ile Ala Ile Ala Val Ala Glu Gly
805 810 815

Thr Asp Arg Ile Ile Glu Val Ala Gln Arg Ile Gly Arg Ala Phe Leu
820 825 830

His Ile Pro Arg Arg Ile Arg Gln Gly Phe Glu Arg Ala Leu Leu
835 840 845



<210> 3 

<211> 2310 

<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> Description of Artificial Sequence: Val120-Ala204
<400> 3 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggcgcctgcc ccaaggtgag cttcgagccc atccccatcc actactgcgc ccccgccggc 420 
ttcgccatcc tgaagtgcaa cgacaagaag ttcaacggca gcggcccctg caccaacgtg 480 
agcaccgtgc agtgcaccca cggcatccgc cccgtggtga gcacccagct gctgctgaac 540
ggcagcctgg ccgaggaggg cgtggtgatc cgcagcgaga acttcaccga caacgccaag 600 
accatcatcg tgcagctgaa ggagagcgtg gagatcaact gcacccgccc caacaacaac 660 
acccgcaaga gcatcaccat cggccccggc cgcgccttct acgccaccgg cgacatcatc 720 
ggcgacatcc gccaggccca ctgcaacatc agcggcgaga agtggaacaa caccctgaag 780 
cagatcgtga ccaagctgca ggcccagttc ggcaacaaga ccatcgtgtt caagcagagc 840
agcggcggcg accccgagat cgtgatgcac agcttcaact gcggcggcga gttcttctac 900 
tgcaacagca cccagctgtt caacagcacc tggaacaaca ccatcggccc caacaacacc 960 
aacggcacca tcaccctgcc ctgccgcatc aagcagatca tcaaccgctg gcaggaggtg 1020 
ggcaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080 
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200 
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260 
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320 
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440 
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500 
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560 
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680 
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800 
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860 
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920 
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980 
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040 
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100 
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160 
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220 
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280 
ggcttcgagc gcgccctgct gtaactcgag 2310 

<210> 4 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201

<400> 4

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagtcca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316

<210> 5
<211> 2322
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Val120-Ile201B



<400> 5 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800
atcgtgggcg gcctggtggg cctgcgcacc gtgttcaccg tgctgagcat cgtgaaccgc 1860
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280
cgccagggct tcgagcgcgc cctgctgtaa ctcgagcgtg ct 2322



<210> 6 

<211> 2328 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Lys121-Val200



<400> 6 

gaattcgcca 

gcagtcttcg 

cccgtgtgga 

accgaggtgc 

gagatcgtgc 

cagatgcacg 

cccgtgatca 

gcccccgccg 

tgcaccaacg 

ctgctgctga 

gacaacgcca 

cccaacaaca 

ggcgacatca 

aacaccctga 

ttcaagcaga 

gagttcttct 

cccaacaaca 

tggcaggagg 

agcaacatca 

gagatcttcc 

tacaaggtgg 

gtgcagcgcg 

gccggcagca 

agcggcatcg 

ctgcagctga 

tacctgaagg 



ccatggatgc 
tttcgcccag 
aggaggccac 
acaacgtgtg 
tggagaacgt 
aggacatcat 
cccaggcctg 
gcttcgccat 
tgagcaccgt 
acggcagcct 
agaccatcat 
acacccgcaa 
tcggcgacat 
agcagatcgt 
gcagcggcgg 
actgcaacag 
ccaacggcac 
tgggcaaggc 
ccggcctgct 
gccccggcgg 
tgaagatcga 
agaagcgcgc 
ccatgggcgc 
tgcagcagca 
ccgtgtgggg 
accagcagct 



aatgaagaga 
cgccgtggag 
caccaccctg 
ggccacccac 
gaccgagaac 
cagcctgtgg 
ccccaaggtg 
cctgaagtgc 
gcagtgcacc 
ggccgaggag 
cgtgcagctg 
gagcatcacc 
ccgccaggcc 
gaccaagctg 
cgaccccgag 
cacccagctg 
catcaccctg 
catgtacgcc 
gctgacccgc 
cggcgacatg 
gcccctgggc 
cgtgaccctg 
ccgcagcctg 
gaacaacctg 
catcaagcag 
gctgggcatc 



gggctctgct 
aagctgtggg 
ttctgcgcca 
gcctgcgtgc 
ttcaacatgt 
gaccagagcc 
agcttcgagc 
aacgacaaga 
cacggcatcc 
ggcgtggtga 
aaggagagcg 
atcggccccg 
cactgcaaca 
caggcccagt 
atcgtgatgc 
ttcaacagca 
ccctgccgca 
ccccccatcc 
gacggcggca 
cgcgacaact 
gtggccccca 
ggcgccatgt 
accctgaccg 
ctgcgcgcca 
ctgcaggccc 
tggggctgca 



gtgtgctgct 
tgaccgtgta 
gcgacgccaa 
ccaccgaccc 
ggaagaacaa 
tgaagccctg 
ccatccccat 
agttcaacgg 
gccccgtggt 
tccgcagcga 
tggagatcaa 
gccgcgcctt 
tcagcggcga 
tcggcaacaa 
acagcttcaa 
cctggaacaa 
tcaagcagat 
gcggccagat 
aggagatcag 
ggcgcagcga 
ccaaggccaa 
tcctgggctt 
tgcaggcccg 
tcgaggccca 
gcgtgctggc 
gcggcaagct 



gctgtgtgga 
ctacggcgtg 
ggcctacgac 
caacccccag 
catggtggag 
cgtgaaggcc 
ccactactgc 
cagcggcccc 
gagcacccag 
gaacttcacc 
ctgcacccgc 
ctacgccacc 
gaagtggaac 
gaccatcgtg 
ctgcggcggc 
caccatcggc 
catcaaccgc 
ccgctgcagc 
caacaccacc 
gctgtacaag 
gcgccgcgtg 
cctgggcgcc 
ccagctgctg 
gcagcacctg 
cgtggagcgc 
gatctgcacc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 
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accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 162 0 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 174 0 
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg agcgtgct 2328 



<210> 7 

<211> 2334 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Leu122-Ser199
<400> 7 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggc aggaggtggg caaggccatg tacgcccccc ccatccgcgg ccagatccgc 1080 
tgcagcagca acatcaccgg cctgctgctg acccgcgacg gcggcaagga gatcagcaac 1140 
accaccgaga tcttccgccc cggcggcggc gacatgcgcg acaactggcg cagcgagctg 1200 
tacaagtaca aggtggtgaa gatcgagccc ctgggcgtgg cccccaccaa ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg accctgggcg ccatgttcct gggcttcctg 1320 
ggcgccgccg gcagcaccat gggcgcccgc agcctgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcgt gctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccaccg ccgtgccctg gaacgccagc tggagcaaca agagcctgga ccagatctgg 1620 
aacaacatga cctggatgga gtgggagcgc gagatcgaca actacaccaa cctgatctac 1680
accctgatcg aggagagcca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcagca agtggctgtg gtacatcaag 1800 
atcttcatca tgatcgtggg cggcctggtg ggcctgcgca tcgtgttcac cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccctgagct tccagacccg cttccccgcc 1920 
ccccgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcagcc ccctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040
tgcctgttca gctaccaccg cctgcgcgac ctgatcctga tcgccgcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggg gcaacctgct gcagtactgg 2160 
atccaggagc tgaagaacag cgccgtgagc ctgttcgacg ccatcgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgaggtggcc cagcgcatcg gccgcgcctt cctgcacatc 2280 
ccccgccgca tccgccaggg cttcgagcgc gccctgctgt aactcgagcg tgct 2334 



<210> 8 

<211> 2316 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Thr202
<400> 8 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 



<210> 9 

<211> 2541 

<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Trp427-Gly431
<400> 9 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctgggg cggcaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 10 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431
<400> 10 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgcgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg cggcaaggcc 1260
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520
cgcgccctgc tgtaactcga g 2541



<210> 11 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431B



<400> 11 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcag cggcaaggcc 1260
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520
cgcgccctgc tgtaactcga g 2541



<210> 12 

<211> 2541 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Lys432



<400> 12 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcaccggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg caacaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 



<210> 13 

<211> 2535 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Asn425-Lys432 
<400> 13 

gaattcgcca ccatggacgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca acgcccccaa ggccatgtac 1260 
gcccccccca tccgcggcca gatccgctgc agcagcaaca tcaccggcct gctgctgacc 1320
cgcgacggcg gcaaggagat cagcaacacc accgagatct tccgccccgg cggcggcgac 1380
atgcgcgaca actggcgcag cgagctgtac aagtacaagg tggtgaagat cgagcccctg 1440
ggcgtggccc ccaccaaggc caagcgccgc gtggtgcagc gcgagaagcg cgccgtgacc 1500
ctgggcgcca tgttcctggg cttcctgggc gccgccggca gcaccatggg cgcccgcagc 1560
ctgaccctga ccgtgcaggc ccgccagctg ctgagcggca tcgtgcagca gcagaacaac 1620
ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1680
cagctgcagg cccgcgtgct ggccgtggag cgctacctga aggaccagca gctgctgggc 1740
atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cgccagctgg 1800
agcaacaaga gcctggacca gatctggaac aacatgacct ggatggagtg ggagcgcgag 1860
atcgacaact acaccaacct gatctacacc ctgatcgagg agagccagaa ccagcaggag 1920
aagaacgagc aggagctgct ggagctggac aagtgggcca gcctgtggaa ctggttcgac 1980
atcagcaagt ggctgtggta catcaagatc ttcatcatga tcgtgggcgg cctggtgggc 2040
ctgcgcatcg tgttcaccgt gctgagcatc gtgaaccgcg tgcgccaggg ctacagcccc 2100
ctgagcttcc agacccgctt ccccgccccc cgcggccccg accgccccga gggcatcgag 2160
gaggagggcg gcgagcgcga ccgcgaccgc agcagccccc tggtgcacgg cctgctggcc 2220
ctgatctggg acgacctgcg cagcctgtgc ctgttcagct accaccgcct gcgcgacctg 2280
atcctgatcg ccgcccgcat cgtggagctg ctgggccgcc gcggctggga ggccctgaag 2340
tactggggca acctgctgca gtactggatc caggagctga agaacagcgc cgtgagcctg 2400
ttcgacgcca tcgccatcgc cgtggccgag ggcaccgacc gcatcatcga ggtggcccag 2460
cgcatcggcc gcgccttcct gcacatcccc cgccgcatcc gccagggctt cgagcgcgcc 2520
ctgctgtaac tcgag 2535



<210> 14 

<211> 2529 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile424-Ala433



<400> 14 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatcg gcggcgccat gtacgccccc 1260
cccatccgcg gccagatccg ctgcagcagc aacatcaccg gcctgctgct gacccgcgac 1320
ggcggcaagg agatcagcaa caccaccgag atcttccgcc ccggcggcgg cgacatgcgc 1380
gacaactggc gcagcgagct gtacaagtac aaggtggtga agatcgagcc cctgggcgtg 1440
gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gaccctgggc 1500
gccatgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgcccg cagcctgacc 1560
ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagaa caacctgctg 1620
cgcgccatcg aggcccagca gcacctgctg cagctgaccg tgtggggcat caagcagctg 1680
caggcccgcg tgctggccgt ggagcgctac ctgaaggacc agcagctgct gggcatctgg 1740
ggctgcagcg gcaagctgat ctgcaccacc gccgtgccct ggaacgccag ctggagcaac 1800
aagagcctgg accagatctg gaacaacatg acctggatgg agtgggagcg cgagatcgac 1860
aactacacca acctgatcta caccctgatc gaggagagcc agaaccagca ggagaagaac 1920
gagcaggagc tgctggagct ggacaagtgg gccagcctgt ggaactggtt cgacatcagc 1980
aagtggctgt ggtacatcaa gatcttcatc atgatcgtgg gcggcctggt gggcctgcgc 2040
atcgtgttca ccgtgctgag catcgtgaac cgcgtgcgcc agggctacag ccccctgagc 2100
ttccagaccc gcttccccgc cccccgcggc cccgaccgcc ccgagggcat cgaggaggag 2160
ggcggcgagc gcgaccgcga ccgcagcagc cccctggtgc acggcctgct ggccctgatc 2220
tgggacgacc tgcgcagcct gtgcctgttc agctaccacc gcctgcgcga cctgatcctg 2280
atcgccgccc gcatcgtgga gctgctgggc cgccgcggct gggaggccct gaagtactgg 2340
ggcaacctgc tgcagtactg gatccaggag ctgaagaaca gcgccgtgag cctgttcgac 2400
gccatcgcca tcgccgtggc cgagggcacc gaccgcatca tcgaggtggc ccagcgcatc 2460
ggccgcgcct tcctgcacat cccccgccgc atccgccagg gcttcgagcg cgccctgctg 2520
taactcgag 2529



<210> 15 
<211> 2523 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile423-Met434



<400> 15 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcggcg gcatgtacgc cccccccatc 1260
cgcggccaga tccgctgcag cagcaacatc accggcctgc tgctgacccg cgacggcggc 1320
aaggagatca gcaacaccac cgagatcttc cgccccggcg gcggcgacat gcgcgacaac 1380
tggcgcagcg agctgtacaa gtacaaggtg gtgaagatcg agcccctggg cgtggccccc 1440
accaaggcca agcgccgcgt ggtgcagcgc gagaagcgcg ccgtgaccct gggcgccatg 1500
ttcctgggct tcctgggcgc cgccggcagc accatgggcg cccgcagcct gaccctgacc 1560
gtgcaggccc gccagctgct gagcggcatc gtgcagcagc agaacaacct gctgcgcgcc 1620
atcgaggccc agcagcacct gctgcagctg accgtgtggg gcatcaagca gctgcaggcc 1680
cgcgtgctgg ccgtggagcg ctacctgaag gaccagcagc tgctgggcat ctggggctgc 1740
agcggcaagc tgatctgcac caccgccgtg ccctggaacg ccagctggag caacaagagc 1800
ctggaccaga tctggaacaa catgacctgg atggagtggg agcgcgagat cgacaactac 1860
accaacctga tctacaccct gatcgaggag agccagaacc agcaggagaa gaacgagcag 1920
gagctgctgg agctggacaa gtgggccagc ctgtggaact ggttcgacat cagcaagtgg 1980
ctgtggtaca tcaagatctt catcatgatc gtgggcggcc tggtgggcct gcgcatcgtg 2040
ttcaccgtgc tgagcatcgt gaaccgcgtg cgccagggct acagccccct gagcttccag 2100
acccgcttcc ccgccccccg cggccccgac cgccccgagg gcatcgagga ggagggcggc 2160
gagcgcgacc gcgaccgcag cagccccctg gtgcacggcc tgctggccct gatctgggac 2220
gacctgcgca gcctgtgcct gttcagctac caccgcctgc gcgacctgat cctgatcgcc 2280
gcccgcatcg tggagctgct gggccgccgc ggctgggagg ccctgaagta ctggggcaac 2340
ctgctgcagt actggatcca ggagctgaag aacagcgccg tgagcctgtt cgacgccatc 2400
gccatcgccg tggccgaggg caccgaccgc atcatcgagg tggcccagcg catcggccgc 2460
gccttcctgc acatcccccg ccgcatccgc cagggcttcg agcgcgccct gctgtaactc 2520
gag 2523



<210> 16

<211> 2517

<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435 



<400> 16 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagggcggct acgccccccc catccgcggc 1260
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 17 
<211> 2517
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435B 
<400> 17 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag caggccccct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620 
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680 
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920 
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980 
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040 
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400 
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 18 
<211> 2322 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Gly431 

<400> 18 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020
aaccgcggcg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 19 

<211> 2322 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Lys432

<400> 19 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgcg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660
acccgcccca acaacaacac ccgcaagagc accaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020
aaccgcggcg gcaacaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322



<210> 20 

<211> 2322 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Trp427-Gly431 



<400> 20 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020
aaccgctggg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322



<210> 21 
<211> 2310 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Lys121-Val200;
Asn425-Lys432



<400> 21 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaacgcc 1020
cccaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260 
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320 
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500 
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560 
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620 
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680 
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800 
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860 
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920 
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980 
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040 
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100 
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160 
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280 
ggcttcgagc gcgccctgct gtaactcgag 2310 

<210> 22 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Val120-Ile201;
Ile424-Ala433 



<400> 22 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 
gccctgctgt aactcgag 2298 

<210> 23 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Val120-Ile201B; Ile424-Ala433



<400> 23 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280
gccctgctgt aactcgag 2298

<210> 24 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Thr202;
Ile424-Ala433 



<400> 24 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280
gccctgctgt aactcgag 2298



<210> 25 

<211> 2358 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val127-Asn195
<400> 25

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340 
gccctgctgt aactcgag 2358 

<210> 26 
<211> 2352 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195;
Arg426-Gly431

<400> 26 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgcggcg gcggcaaggc catgtacgcc 1080
ccccccatcc gcggccagat ccgctgcagc agcaacatca ccggcctgct gctgacccgc 1140
gacggcggca aggagatcag caacaccacc gagatcttcc gccccggggg cggcgacatg 1200
cgcgacaact ggcgcagcga gctgtacaag tacaaggtgg tgaagatcga gcccctgggc 1260
gtggccccca ccaaggccaa gcgccgcgtg gtgcagcgcg agaagcgcgc cgtgaccctg 1320
ggcgccatgt tcctgggctt cctgggcgcc gccggcagca ccatgggcgc ccgcagcctg 1380
accctgaccg tgcaggcccg ccagctgctg agcggcatcg tgcagcagca gaacaacctg 1440
ctgcgcgcca tcgaggccca gcagcacctg ctgcagctga ccgtgtgggg catcaagcag 1500
ctgcaggccc gcgtgctggc cgtggagcgc tacctgaagg accagcagct gctgggcatc 1560
tggggctgca gcggcaagct gatctgcacc accgccgtgc cctggaacgc cagctggagc 1620
aacaagagcc tggaccagat ctggaacaac atgacctgga tggagtggga gcgcgagatc 1680
gacaactaca ccaacctgat ctacaccctg atcgaggaga gccagaacca gcaggagaag 1740
aacgagcagg agctgctgga gctggacaag tgggccagcc tgtggaactg gttcgacatc 1800
agcaagtggc tgtggtacat caagatcttc atcatgatcg tgggcggcct ggtgggcctg 1860
cgcatcgtgt tcaccgtgct gagcatcgtg aaccgcgtgc gccagggcta cagccccctg 1920
agcttccaga cccgcttccc cgccccccgc ggccccgacc gccccgaggg catcgaggag 1980
gagggcggcg agcgcgaccg cgaccgcagc agccccctgg tgcacggcct gctggccctg 2040
atctgggacg acctgcgcag cctgtgcctg ttcagctacc accgcctgcg cgacctgatc 2100
ctgatcgccg cccgcatcgt ggagctgctg ggccgccgcg gctgggaggc cctgaagtac 2160
tggggcaacc tgctgcagta ctggatccag gagctgaaga acagcgccgt gagcctgttc 2220
gacgccatcg ccatcgccgt ggccgagggc accgaccgca tcatcgaggt ggcccagcgc 2280
atcggccgcg ccttcctgca catcccccgc cgcatccgcc agggcttcga gcgcgccctg 2340
ctgtaactcg ag 2352
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